.PAF/.POWER Testing



  • I have opened this thread so we can try out the new .PAF format.

    Below is a link to a large PDF that is similar in size and content to what my business and my friend's business would handle. PartyLite is just an example, not something I personally deal with, but it contains data and loads of pictures, both high and low quality.

    Here are some stats:

    The original size of the PDF is 48.9 MB [51,351,506 bytes].

    Compressed with the default PA settings in the following formats:

    Link to PDF
    http://omega.livedrive.com/item/91a1987e76934c91bc66b8727147b020

    Zip - 48.4 MB [50,752,392 bytes] - 1.17% ratio
    ZipX - 45.4 MB [47,648,914 bytes] - 7.21% ratio
    7z - 44.8 MB [47,063,757 bytes] - 8.35% ratio

    So, from the above, 7z is currently the best method for compressing this large PDF. Now, what can PAF do?

    Over to you, guys…


  • conexware

    OK, although I would like to see what you actually deal with in a business environment 🙂



  • @spwolf:

    OK, although I would like to see what you actually deal with in a business environment 🙂

    Very similar. Some of our PDFs are PowerPoint presentations containing HD pictures, much like this one. However, I can't provide a real example for much of it, as it contains confidential sales figures, and by removing those pages I am then reducing the size.

    However, the PDF I have as an example is similar in size to the proofs at my friend's marketing firm; they use images and PDFs that make up 10 MB to 50 MB documents.

    So it's the same concept and the same size, with the same variables. 🙂

    So… how good is this PAF?


  • conexware

    Probably 15%, but the main deal here would be to also recompress the images, as it is the images that make up most of the 50 MB.

    But I will let you know the real result soon.



  • @spwolf:

    Probably 15%, but the main deal here would be to also recompress the images, as it is the images that make up most of the 50 MB.

    But I will let you know the real result soon.

    15% is still better than any other option in PA currently.

    What would also be nice is if you could split a PAF archive into a series of volumes, like a RAR, to help when emailing.


  • conexware

    Well, it is actually 32.7 MB in strongest mode 😉

    So 33% ratio

    🙂



  • @spwolf:

    Well, it is actually 32.7 MB in strongest mode 😉

    So 33% ratio 🙂

    OK, I am very impressed… And I assume that when you open it within PA and then view the PDF there is no loss of quality? Is this correct?

    I would love to get this trialled out! I will need to speak to Martin at Modern Printing Designs; he is over in Hong Kong with his boss Cindy Cui. However, when I get a working copy of this format, I will install my copy of PA on his return and ask him to try it out.

    He uses Outlook, so in theory I would need it to auto-PAF each email with POAP. Many of the features I requested for POAP V3 related to my own and other clients' experiences…

    🙂 There's a reason I make requests! And 9 times out of 10 they pay off!


  • conexware

    It will take a while before it is in the shipping product.

    Yes, of course, it is completely lossless.



  • Ok, great!

    Shame, as it could throw up some really keen testing environments. However, if I have any more examples of files to compress, I will post them here for you.

    Thanks fella.


  • conexware

    More would be better… real-world examples…
    You will obviously have it once it is ready for alpha testing.

    PDF, PNG, DOCX/PPTX/XLSX/ODT, etc.


  • conexware

    Moved to general forums for more testing examples.


  • Banned

    How’s the speed of the compression engine for the new format – both compression and extraction?

    Can we build self-extracting files from a .PAF archive?


  • conexware

    @Socrates:

    How’s the speed of the compression engine for the new format – both compression and extraction?

    Can we build self-extracting files from a .PAF archive?

    Right now we are still working on the codecs; the speed for strongest mode (in the results above) should be similar to 7-Zip's strongest mode…

    As for the SFX: an SFX is just a small app that extracts the archive inside it, so it is possible.
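
    For illustration only, an SFX along those lines can be sketched in a few lines of Python, here with a ZIP payload standing in for PAF (whose tooling is not public) and a made-up marker to locate the archive inside the stub:

        import io, sys, zipfile

        MAGIC = b"SFX_PAYLOAD"  # hypothetical marker placed between stub and archive

        def build_sfx(stub_path: str, archive_path: str, out_path: str) -> None:
            # An SFX is just the extractor stub with the archive appended to it.
            with open(out_path, "wb") as out:
                out.write(open(stub_path, "rb").read())
                out.write(MAGIC)
                out.write(open(archive_path, "rb").read())

        def extract_self(target_dir: str) -> None:
            # At run time the stub reads its own file and unpacks whatever follows the marker.
            data = open(sys.argv[0], "rb").read()
            payload = data[data.rindex(MAGIC) + len(MAGIC):]
            zipfile.ZipFile(io.BytesIO(payload)).extractall(target_dir)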


  • Banned

    That’s very good news indeed. Obviously better compression is good. Nice to know, though, that compression and extraction are still speedy.


  • conexware

    @Socrates:

    That’s very good news indeed. Obviously better compression is good. Nice to know, though, that compression and extraction are still speedy.

    It will have to be a bit slower, but overall it is still fast.
    The extra compression does not come from utilizing slower compression methods (which exist and can be used, but are considerably slower to extract, up to 10x), but from doing better work on preparing the data to be compressed.

    In this case, it is actually recompression of the deflate codec, which is used to make PDF, PNG, DOCX, and ODT files smaller. However, just as in ZIP, it is an inferior compression codec, and since its output is already compressed, you can't significantly compress it further. So what we had to do is create a re-def codec that opens up this data so it can be compressed better.

    The hard part is to put it back together losslessly, still maintain a good compression ratio, and make it all a fast and seamless experience for the end user.
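
    To make the idea concrete, here is a minimal Python sketch of that "open it up" step, assuming a zlib-wrapped deflate stream (as in PDF FlateDecode) and using the standard zlib/LZMA libraries as stand-ins for the actual re-def and PLZMA codecs, which are not public:

        import zlib, lzma

        def naive_recompress(deflate_stream: bytes) -> bytes:
            # Undo the weak deflate layer so a stronger codec sees the raw data.
            raw = zlib.decompress(deflate_stream)
            return lzma.compress(raw, preset=9)

        def naive_restore(lzma_stream: bytes) -> bytes:
            raw = lzma.decompress(lzma_stream)
            # Re-deflating rarely reproduces the original bytes exactly, which is
            # why the real codec also has to store differentials (see below).
            return zlib.compress(raw, 9)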


  • conexware

    OK, so no tasty PDFs to test? C'mon guys… we have just finished with the sample that Richard sent over. We need more.

    There is nothing magical here, just a LOT of work.

    Once again, how everything works is that we recompress the deflate codec, which is used in PNG, PDF, ZIP, DOCX, ODT, etc.

    So (simplified):
    1. Extract the deflate tokens from the file.
    2. Compress them with a better method (PLZMA currently).
    3. Store the differences between our deflate implementation and the original file so we can get a 100% match.
    4. Done.

    Basically, the main “problem” here is that everyone uses a different deflate implementation. You know this if you have ever zipped something with PA, WZ, WR, 7z or any other tool: you will never end up with the same file sizes.

    This is the problem during “reflate”: because you can't arrive at a 100% identical file, you have to use differentials, much like what Patchbeam uses, for instance, to store the difference between a new and an old file.

    So we need more interesting examples so we can tune our detection models so that there is a minimal amount of diffs and the file size is as small as possible… With lots of diffs, we gain no compression.
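
    As an illustration only (not the actual reflate code), a differential step along the lines of step 3 above could look like this, with Python's zlib standing in for our own deflate implementation:

        import zlib

        def make_patch(original_deflate: bytes, raw_data: bytes, level: int = 9):
            # Re-deflate the raw data with our settings, then record every byte
            # that differs from the original stream, plus any length mismatch.
            ours = zlib.compress(raw_data, level)
            n = min(len(ours), len(original_deflate))
            diffs = [(i, original_deflate[i]) for i in range(n) if ours[i] != original_deflate[i]]
            return len(original_deflate), diffs, original_deflate[n:]

        def apply_patch(raw_data: bytes, length: int, diffs, tail: bytes, level: int = 9) -> bytes:
            # Rebuild our deflate stream and patch it back into a bit-exact copy.
            rebuilt = bytearray(zlib.compress(raw_data, level)[:length])
            for i, b in diffs:
                rebuilt[i] = b
            return bytes(rebuilt + tail)

    The fewer bytes our implementation gets wrong, the smaller the stored diff, which is exactly why tuning the detection against real-world samples matters.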

    For instance, PartyLite.pdf that SirRichard sent:
    Original: 51 MB
    PLZMA: 46 MB
    Ver1 of reflate: 44 MB
    Perfect reflate: 38 MB

    So our first versions of reflate needed 6 MB of differentials to re-create the PDF perfectly… We improved that so it ended up being 100 KB or so, and now we get 38 MB on that file.

    That is before PJPEG (another codec), which recompresses the JPEGs inside that PDF, so it ends up being the great 33 MB.



  • I have been asking; unfortunately, when it comes to work we don't have daft large PDFs that would benefit from compression.

    My photo person doesn't use PNG either. However, my partner does PartyLite on the side, as she likes the product, so I can obtain many PDFs containing lots of images. If you would like more, shall I post them on my LiveDrive?


  • conexware

    They don't have to be particularly big… 30% less is 30% less. Just authentic.


  • conexware

    Here are the results from independent testing of our test versions of reflate, our first codec for PAF.

    http://www.squeezechart.com/reflate0rc1.html
    http://www.squeezechart.com/reflate-pdf.html

    This is without the specific JPG codec included (so PDFs that contain JPGs will not be recompressed to their smallest size), and without the detection mode for the level of recompression needed, which will correct those DOCX results, for instance.

    Very encouraging 🙂



  • Indeed!! A lot of promise there! Looking forward to it being available for us Toolbox power users 🙂


 
