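I recently had to compress a large number of small files directly into a single zip. Each entry in a zip is independent and never needs appending: one entry, one file's content.
So I simply wrote the entries from multiple threads, and it blew up. The code failed with this error: write beyond end of stream!
Here is a reconstruction of the code as it was at the time: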
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// JUnit 4 is assumed here; with JUnit 5 import org.junit.jupiter.api.Test instead
import org.junit.Test;

public class MultiThreadWriteZipFile {

    private static ExecutorService executorService = Executors.newFixedThreadPool(50);

    private static CountDownLatch countDownLatch = new CountDownLatch(50);

    @Test
    public void multiThreadWriteZip() throws IOException, InterruptedException {
        File file = new File("D:\\Gis开发\\数据\\影像数据\\china_tms\\2\\6\\2.jpeg");
        // Create the zip
        ZipOutputStream zipOutputStream =
                new ZipOutputStream(new FileOutputStream(new File("E:\\java\\test\\test.zip")));

        for (int i = 0; i < 50; i++) {
            String entryName = i + File.separator + i + File.separator + i + ".jpeg";
            executorService.submit(() -> {
                try (InputStream in = new FileInputStream(file)) {
                    writeSource2ZipFile(in, entryName, zipOutputStream);
                } catch (IOException e) {
                    // Print the failure instead of silently discarding it
                    e.printStackTrace();
                } finally {
                    // Always count down, otherwise a failed task blocks the main thread forever
                    countDownLatch.countDown();
                }
            });
        }
        // Block the main thread until every task has finished
        countDownLatch.await();
        // Close the stream
        zipOutputStream.close();
    }

    public void writeSource2ZipFile(InputStream inputStream,
                                    String zipEntryName,
                                    ZipOutputStream zipOutputStream) throws IOException {
        // Start a new entry
        zipOutputStream.putNextEntry(new ZipEntry(zipEntryName));
        byte[] buf = new byte[1024];
        int len;
        // Write the data into the entry
        while ((len = inputStream.read(buf)) != -1) {
            // Write only the bytes actually read, not the whole buffer
            zipOutputStream.write(buf, 0, len);
        }
        zipOutputStream.closeEntry();
        zipOutputStream.flush();
    }
}
Running the code above as-is fails with: write beyond end of stream
Change
private static ExecutorService executorService = Executors.newFixedThreadPool(50);
to
private static ExecutorService executorService = Executors.newSingleThreadExecutor();
and the code runs normally!
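As for the cause, tracing through the code makes it clear. Let's start with where the error is thrown:
It comes from line 201 of DeflaterOutputStream in the java.util.zip package (JDK 1.8; other versions may differ). Here is the code: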
public void write(byte[] b, int off, int len) throws IOException {
    if (def.finished()) {
        throw new IOException("write beyond end of stream");
    }
    if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
        throw new IndexOutOfBoundsException();
    } else if (len == 0) {
        return;
    }
    if (!def.finished()) {
        def.setInput(b, off, len);
        while (!def.needsInput()) {
            deflate();
        }
    }
}
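The crux is the state reported by def.finished(). That state is defined in the Deflater class, a compression utility class that Java implements on top of the ZLIB compression library.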
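To see this check in isolation, here is a minimal single-threaded sketch. It is not the original code: the class and method names are made up for illustration. It finishes a DeflaterOutputStream by hand and then writes again, which is roughly the situation a writing thread ends up in when another thread has just finished the shared deflater.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical demo class, not part of the original project.
class FinishedDeflaterDemo {

    // Writes, finishes the stream, then writes again. Returns the message of
    // the IOException raised by the second write, or null if none is raised.
    static String writeAfterFinish() {
        try (DeflaterOutputStream out = new DeflaterOutputStream(new ByteArrayOutputStream())) {
            out.write(new byte[] {1, 2, 3});
            out.finish();                    // drives the internal Deflater to its finished state
            out.write(new byte[] {4, 5, 6}); // def.finished() is true at this point
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAfterFinish());
    }
}
```

Run on its own, this should report the same message as the multithreaded failure, with no threads involved at all.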
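The following method is what triggers the change of state: finish() sets the finish flag, and once the remaining input has been compressed, finished() starts returning true.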
public void finish() {
    synchronized (zsRef) {
        finish = true;
    }
}
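The call to finish() ultimately originates from the zipOutputStream.putNextEntry(new ZipEntry(zipEntryName)); line in our code above. In the JDK 1.8 source, putNextEntry calls closeEntry() if an entry is still open, and closeEntry() in turn calls def.finish().
The idea is simple: every time a new entry is started, the previous entry has to be closed first, and that is what triggers this condition. This state is not private to a thread, as the following field declarations show: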
public class Deflater {
    private final ZStreamRef zsRef;
    private byte[] buf = new byte[0];
    private int off, len;
    private int level, strategy;
    private boolean setParams;
    private boolean finish, finished;
    private long bytesRead;
    private long bytesWritten;
    // ...
}
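The implicit close of the previous entry can be observed on a single thread. The sketch below is only an illustration (class name, entry names, and contents are made up): it never calls closeEntry(), yet both entries come out intact, because putNextEntry and close() close the open entry themselves.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the original project.
class ImplicitCloseDemo {

    // Writes two entries without ever calling closeEntry() and returns
    // the entry names found when the archive is read back.
    static List<String> entryNames() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry("a.txt"));
                zip.write("first".getBytes(StandardCharsets.UTF_8));
                // No closeEntry() here: this putNextEntry closes "a.txt" first
                zip.putNextEntry(new ZipEntry("b.txt"));
                zip.write("second".getBytes(StandardCharsets.UTF_8));
                // close() closes "b.txt" and writes the central directory
            }
            List<String> names = new ArrayList<>();
            try (ZipInputStream in =
                         new ZipInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                    names.add(e.getName());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(entryNames());
    }
}
```

With several threads sharing one stream, that same implicit close hits whichever entry happens to be open, including one that another thread is still writing.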
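A ZipOutputStream owns a single Deflater and tracks a single current entry, so when several threads share one stream, one thread's putNextEntry or closeEntry can finish the deflater while another thread is still writing. Under multithreading, this state is therefore definitely not thread-safe!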
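If some parallelism is still wanted, one option is to prepare the data in parallel and make each whole entry (putNextEntry, write, closeEntry) one atomic step. The following is a minimal sketch under that assumption, not the code from the original project: the class, the method names, and the in-memory stand-in for file contents are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original project.
class SynchronizedZipWriter {

    // Writes one complete entry while holding the stream's lock, so no other
    // thread can start or close an entry in the middle of it.
    static void writeEntry(ZipOutputStream zip, String name, byte[] data) throws IOException {
        synchronized (zip) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(data, 0, data.length);
            zip.closeEntry();
        }
    }

    // Builds an in-memory zip from several worker threads and returns the
    // number of entries found when the archive is read back.
    static int zipInParallel(int threads, int entries) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                List<Future<Void>> tasks = new ArrayList<>();
                for (int i = 0; i < entries; i++) {
                    final int n = i;
                    Callable<Void> task = () -> {
                        // Stand-in for reading a source file; this part runs in parallel
                        byte[] data = ("content of entry " + n).getBytes(StandardCharsets.UTF_8);
                        writeEntry(zip, n + "/" + n + ".txt", data);
                        return null;
                    };
                    tasks.add(pool.submit(task));
                }
                for (Future<Void> task : tasks) {
                    task.get(); // waits for the task and rethrows any failure
                }
            }
            int count = 0;
            try (ZipInputStream in =
                         new ZipInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                while (in.getNextEntry() != null) {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(zipInParallel(8, 50));
    }
}
```

Note the trade-off: compression still happens inside the lock, so only the reading and preparation of data runs in parallel. Whether that beats a single-threaded executor depends on how expensive reading the source files is.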
That's all for this write-up on the error you get when writing a zip from multiple threads!