Invalid EndOfStream C# gzip
If I try to read a gzip file with GZipStream in C# like this:
string filePath = "....gz";
int count = 0;
using (FileStream reader = File.OpenRead(filePath))
using (var zip = new GZipStream(reader, CompressionMode.Decompress))
using (StreamReader unzip = new StreamReader(zip))
{
    while (!unzip.EndOfStream)
    {
        var data = unzip.ReadLine();
        count++;
    }
}
Console.WriteLine(count);
I get fewer rows than when reading the decompressed CSV file (decompressed with the Windows shell):
filePath = "...csv";
count = 0;
using (FileStream reader = File.OpenRead(filePath))
using (StreamReader unzip = new StreamReader(reader))
{
    while (!unzip.EndOfStream)
    {
        var data = unzip.ReadLine();
        count++;
    }
}
Console.WriteLine(count);
The samples are at https://developers.thomsonreuters.com/elektron-data-solutions/datascope-select-rest-api/downloads
Any ideas? The Size and Packed Size of the .gz archive also look strange: the Packed Size is bigger than the decompressed Size (in the WinRAR UI).
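One detail worth noting: a .gz file may contain several concatenated gzip members, and at least some versions of the built-in .NET GZipStream stop at the end of the first member, while tools like 7-Zip continue; archiver UIs also read the uncompressed size from a single member's trailer, which could explain the odd Size/Packed Size figures. A quick way to check whether GZipStream itself stops early, independent of any line-reading logic, is to count the decompressed bytes (a minimal sketch):

```csharp
using System;
using System.IO;
using System.IO.Compression;

string filePath = "....gz"; // the same file as above
long totalBytes = 0;
byte[] buffer = new byte[81920];
using (FileStream reader = File.OpenRead(filePath))
using (var zip = new GZipStream(reader, CompressionMode.Decompress))
{
    int n;
    // Drain the decompressed stream and count the bytes it yields.
    while ((n = zip.Read(buffer, 0, buffer.Length)) > 0)
        totalBytes += n;
}
// If this prints less than the size of the CSV produced by the Windows
// shell, GZipStream is stopping before the end of the archive.
Console.WriteLine(totalBytes);
```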
Best Answer
-
I found a workaround for now: with ICSharpCode.SharpZipLib.GZip.GZipInputStream (https://github.com/icsharpcode/SharpZipLib) I read all the lines, so the problem seems to be the .NET built-in GZipStream (or malformed files from TRTH).
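For anyone else hitting this, the SharpZipLib workaround is essentially a one-line swap of the stream type in the original snippet; a minimal sketch, assuming the SharpZipLib package is referenced:

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.GZip;

string filePath = "....gz";
int count = 0;
using (FileStream reader = File.OpenRead(filePath))
// GZipInputStream replaces System.IO.Compression.GZipStream here.
using (var zip = new GZipInputStream(reader))
using (StreamReader unzip = new StreamReader(zip))
{
    while (!unzip.EndOfStream)
    {
        var data = unzip.ReadLine();
        count++;
    }
}
Console.WriteLine(count);
```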
Answers
-
M.ROSSIGNOLI, exactly which sample did you use? Please note that the samples under the URL you posted are for DSS. For TRTH the samples are at https://developers.thomsonreuters.com/thomson-reuters-tick-history-trth/thomson-reuters-tick-history-trth-rest-api/downloads.
-
M.ROSSIGNOLI, what you observe reminds me of this issue. It was in Java, but the symptoms were similar: counting the number of lines of the file did not deliver the same thing when decompressing from the data stream from the server, or from a file saved on disk. Small data amounts worked fine, but with larger ones the end of the file was dropped. The issue was intermittent, so we had varying numbers of lines for what should have delivered a constant number of lines.
We found out that it was due to an issue with decompressing data on the fly: the popular public libraries we were using were not reliable enough to decompress large amounts of data flowing in through an input stream. After a long investigation we found other libraries that were more reliable. We also found a workaround, which was to first save the file to disk (without decompressing), and then to read it back from disk and decompress it at that point. That worked fine, without dropping data.
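That save-to-disk-first workaround could look like this in C#. This is a sketch only: the URL and file name are placeholders, and a real request would also need whatever HTTP headers and authentication the server requires.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net;

// Hypothetical URL, for illustration only.
string url = "https://example.com/results.csv.gz";
string localPath = "results.csv.gz";

// Step 1: save the raw compressed bytes to disk, without decompressing.
using (var web = new WebClient())
{
    web.DownloadFile(url, localPath);
}

// Step 2: decompress from the complete file on disk in a second pass.
int count = 0;
using (FileStream reader = File.OpenRead(localPath))
using (var zip = new GZipStream(reader, CompressionMode.Decompress))
using (StreamReader unzip = new StreamReader(zip))
{
    while (unzip.ReadLine() != null)
        count++;
}
Console.WriteLine(count);
```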
-
The sample above reads from a file on disk. I logged in to DataScope, downloaded the .gz file with a browser (Chrome), and then ran the code. It's very weird behaviour.
-
I mean the "C# Example Application" Dss.Api.Examples.sln .NET solution.
-
I have found a similar issue. It seems that either the .NET GZipStream or DeflateStream somehow cannot completely decompress large gzip files generated by TRTH.
As a workaround I use SevenZipSharp to decompress the file. It requires both SevenZipSharp.dll and the 7-Zip 9.15 DLL files. To use it, add SevenZipSharp.dll as a Reference and modify the path in the code to point to the location of 7z.dll. Below is the sample code.
//using (var zip = new GZipStream(reader, CompressionMode.Decompress))
SevenZip.SevenZipExtractor.SetLibraryPath(@"<your local path>\7z.dll");
using (var extractor = new SevenZip.SevenZipExtractor(filePath))
using (MemoryStream ms = new MemoryStream())
{
    int indexZip = extractor.ArchiveFileData.First().Index;
    // Decompress the result to a memory stream
    extractor.ExtractFile(indexZip, ms);
    ms.Position = 0;
    using (StreamReader unzip = new StreamReader(ms))
    {
        while (!unzip.EndOfStream)
        {
            var data = unzip.ReadLine();
            count++;
        }
    }
}
Console.WriteLine(count);
Hope this helps.
-
Ah yes, ok. That one is the same for DSS and TRTH.