Pitfalls of unzipping zip files in Java #314

Open
TFdream opened this issue Sep 24, 2020 · 1 comment

TFdream commented Sep 24, 2020

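Today the bulk product listing feature went live (file upload, accepting Excel and zip files), and I was asked to help find out what was going wrong with it.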

I started by checking the production logs for errors. Sure enough, the backend had thrown an exception:

    java.lang.IllegalArgumentException: MALFORMED
        at java.util.zip.ZipCoder.toString(ZipCoder.java:58)
        at java.util.zip.ZipFile.getZipEntry(ZipFile.java:583)
        at java.util.zip.ZipFile.access$900(ZipFile.java:60)
        at java.util.zip.ZipFile$ZipEntryIterator.next(ZipFile.java:539)
        at java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:514)
        at java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:495)

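I Googled the cause of this exception first. Every answer said it was an encoding problem and that changing the charset from UTF-8 to GBK would fix it.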

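    // Requires: java.io.IOException, java.nio.charset.Charset, java.util.Enumeration,
    // java.util.zip.ZipEntry, java.util.zip.ZipFile, org.junit.Test
    @Test
    public void testApp() throws IOException {
        // Decode entry names as GBK instead of the default UTF-8
        Charset gbk = Charset.forName("GBK");
        // try-with-resources closes the archive when done
        try (ZipFile zipFile = new ZipFile("/Users/apple/Documents/自动上架图片_6_向上_20200924_上架商品_1.zip", gbk)) {
            for (Enumeration<? extends ZipEntry> entries = zipFile.entries(); entries.hasMoreElements();) {
                ZipEntry entry = entries.nextElement();
                System.out.println(entry.getName());
            }
        }
    }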

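I extracted the production zip file, repackaged it on my own machine (using the Mac's built-in compression), and ran the code above again. To my surprise, it unzipped successfully. Why? I searched online and eventually found the answer: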

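When compressing on Windows, entry names are written in the system encoding (GB2312 on a Chinese-locale system), whereas the default encoding on Mac is UTF-8.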
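To make the mismatch concrete, here is a minimal sketch (not taken from the original post) that encodes a Chinese file name as GBK and then tries to decode those bytes strictly as UTF-8. The class name `EntryNameDecoding`, the helper `decodesCleanly`, and the sample name are hypothetical, and the sketch assumes the JVM ships the GBK charset. The strict decode fails in the same way the `MALFORMED` exception above suggests.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EntryNameDecoding {
    /** Returns true if the raw bytes decode without error under the given charset. */
    static boolean decodesCleanly(byte[] raw, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u4e2d\u6587.txt" is a sample Chinese file name, written with escapes
        // so the source compiles regardless of the file encoding.
        byte[] raw = "\u4e2d\u6587.txt".getBytes(gbk);
        System.out.println(decodesCleanly(raw, StandardCharsets.UTF_8)); // false
        System.out.println(decodesCleanly(raw, gbk));                    // true
    }
}
```

Note that some GBK byte pairs happen to be valid UTF-8, so a clean UTF-8 decode does not prove the name was really written as UTF-8.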

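Finally I asked the colleague who had uploaded the file. He had compressed it with WinRAR on Windows (so it seems different archiving tools behave differently here).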
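Since uploads can come from either kind of tool, one possible workaround is to try UTF-8 first and fall back to GBK when the names cannot be decoded. The following is only a sketch under that assumption, not the fix the post settled on. `ZipNameLister` and its methods are hypothetical names. It catches both `IllegalArgumentException` and `ZipException` because which of the two is thrown for a bad entry name appears to depend on the JDK version.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;

class ZipNameLister {
    /** Lists all entry names, decoding them with the given charset. */
    static List<String> listNames(File zip, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip, charset)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    /** Tries UTF-8 first; retries with GBK if the names are not valid UTF-8. */
    static List<String> listNamesWithFallback(File zip) throws IOException {
        try {
            return listNames(zip, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException | ZipException e) {
            // Assumption: anything that is not UTF-8 came from a GBK Windows tool.
            return listNames(zip, Charset.forName("GBK"));
        }
    }
}
```

The fallback only triggers when decoding actually fails. A GBK name whose bytes happen to form valid UTF-8 would still come out garbled, so this is a heuristic rather than a guarantee.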

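At this point the problem was basically solved, which made me think of the ASF (anyone who writes Java should know the ASF). Googling "Apache zip", the first result was Apache commons-compress, and I immediately saw hope.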

The Apache commons-compress website describes the library as follows:

Archivers and Compressors

Commons Compress calls all formats that compress a single stream of data compressor formats while all formats that collect multiple entries inside a single (potentially compressed) archive are archiver formats.

The compressor formats supported are gzip, bzip2, xz, lzma, Pack200, DEFLATE, Brotli, DEFLATE64, ZStandard and Z, the archiver formats are 7z, ar, arj, cpio, dump, tar and zip. Pack200 is a special case as it can only compress JAR files.

We currently only provide read support for arj, dump, Brotli, DEFLATE64 and Z. arj can only read uncompressed archives, 7z can read archives with many compression and encryption algorithms supported by 7z but doesn't support encryption when writing archives.

First, add the dependency to the pom file:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.20</version>
</dependency>

KangMz commented Jan 24, 2024

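  1. It looks like you never finished writing this.
  2. Did you test it once the code was done? In my own experience, using this package still produces garbled file names when the encodings do not match.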
