Cannot extract Canterbury Corpus
-
I downloaded the corpus files from http://www.cosc.canterbury.ac.nz/corpus/descriptions
but using PA 9.20.07 I cannot extract the “cantrbry.tar.gz” test files (the other tar.gz archives extract OK).
I tried both the PA explorer view (which does display the files inside the archive) and the Windows Explorer context menu.
In both cases I tried to extract to the sub-folder “cantrbry”. The folder is created, but it is empty.
-
Hi,
It seems to work from the main PA window with explorer view and the Extract button… what exactly did you try yourself?
thanks,
-
Extract to filename/ in Explorer doesn't seem to work the first time you do it, but it works the second time. It also works with Extract here, every time. Hmph.
-> It seems to have something to do with extracting the tar file from the gz in the shell extensions, but it works in the main PA window since the tar file is always opened when you open the archive.
thanks,
-
Extract to filename/ in Explorer doesn't seem to work the first time you do it, but it works the second time.
Yes, extracting just that file works for me now as well.
Strange - maybe related to the sequence of operations?
@spwolf: It seems to work from the main PA window with explorer view and the Extract button… what exactly did you try yourself?
As I remember, what I did was download all four test files and try extracting each into its own subdirectory: “artificial test”, “calgary test”, “canterbury test” and “large test”.
I then compressed each into a separate subdirectory using the shell (right-click on the folder name), Zip plus Options… (Zip, max, deflate64), renaming the output Zip file to {foldername} PA.zip,
e.g. “artificial test PA.zip” etc.
I noticed that “canterbury test PA.zip” wasn’t created, but there was no error message. Tried again - still nothing.
That’s when I checked and saw that the subfolder was empty. Thought it was me, wouldn’t be the first time :rolleyes:
Deleted the subfolder and tried shell extract again. Checked - still no contents.
Deleted subfolder and double clicked to open the tar.gz in PA.
PA showed tar file in folder window and correctly listed all the files.
Used the toolbar button to try to extract (again to a subfolder). Still no contents.
Tried toolbar again. Failed.
Came here to report it.
Double hmph…
-
Heh, I think I know what might be wrong - it is something with gzip extraction from the shell extension in this case - and since it is done differently in the main PA window, people would not notice it. I am sure we will fix it for the next release.
thanks!
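
For anyone who wants to cross-check the archive outside PA in the meantime, here is a minimal Python sketch of the two-stage extraction discussed above: strip the gzip layer first, then unpack the tar that was inside it, and complain if the target folder would end up empty. It is only an illustration and says nothing about how PA does this internally; the function name extract_tar_gz_two_step is made up for this example, and it assumes the archive is trusted.

```python
import gzip
import shutil
import tarfile
import tempfile
from pathlib import Path


def extract_tar_gz_two_step(archive: Path, dest: Path) -> list[str]:
    """Gunzip to a temporary .tar, unpack it into dest, return the file names.

    Hypothetical helper for illustration only; assumes a trusted archive.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        inner_tar = Path(tmp) / "inner.tar"
        # Stage 1: remove the gzip layer, leaving a plain tar file.
        with gzip.open(archive, "rb") as src, open(inner_tar, "wb") as out:
            shutil.copyfileobj(src, out)
        # Stage 2: unpack the regular files from the plain tar.
        with tarfile.open(inner_tar, "r:") as tar:
            members = [m for m in tar.getmembers() if m.isfile()]
            names = [m.name for m in members]
            tar.extractall(dest, members=members)
    # Fail loudly instead of silently leaving an empty folder behind.
    missing = [n for n in names if not (dest / n).is_file()]
    if not names or missing:
        raise RuntimeError(f"extraction incomplete, missing: {missing}")
    return sorted(names)
```

With the file from this thread it could be called as extract_tar_gz_two_step(Path("cantrbry.tar.gz"), Path("cantrbry")). If that produces the files while the shell extension still leaves the folder empty, the archive itself is fine.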