One feature most archive formats are lacking is archive integrity/recovery. RAR has had this capability from the start and thus remains the format of choice for those concerned with data loss from events such as storage media bit rot: https://en.wikipedia.org/wiki/Data_degradation
It would be a shame to not use this opportunity to also incorporate data integrity.
The PAR1 and PAR2 specifications are available on the net. I'm not sure where to find the actual PAR3 specification, but maybe you could start here: https://www.livebusinesschat.com/smf/index.php?topic=2118.0 and here: https://multipar.eu/
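To make the recovery idea concrete, here is a minimal sketch of the simplest possible recovery record: a single XOR parity block over equal-sized data blocks. This is not what PAR2 does (PAR2 uses Reed-Solomon codes and can repair several damaged blocks), and the function names `xor_parity` and `recover_block` are made up for illustration. It only shows why one extra block is enough to rebuild any one lost block.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of equal-sized blocks (the parity block)."""
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_block(blocks, parity, missing_index):
    """Rebuild the block at missing_index from the survivors plus parity.

    The entry at missing_index is ignored, so it may be None or damaged data.
    """
    survivors = [b for i, b in enumerate(blocks) if i != missing_index]
    # XOR-ing the parity with every surviving block cancels them out and
    # leaves exactly the missing block.
    return xor_parity(survivors + [parity])


data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)
damaged = [b"abcd", None, b"ijkl"]  # block 1 was lost
restored = recover_block(damaged, parity, 1)
```

A single parity block can repair only one known-bad block per group, so a real archive would also need per-block checksums to locate the damage and a stronger code to survive multiple failures.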
PA needs a huge push in the deduplication department.
Further improvements to rep-fma will help, but given that RAR, for example, can completely destroy PA on some sets of data, it would be a good idea to consider other options.
Hash-based deduplication, like RAR5 can do, could be a very good way to ensure that no completely identical files are stored twice, almost regardless of dictionary size or how far apart the files are.
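A rough sketch of what file-level hash deduplication could look like is below. It is an assumption-laden illustration, not how RAR5 or PA implement it: the function names `file_digest` and `plan_dedup`, the choice of SHA-256, and the 1 MiB read size are all hypothetical. The point is that duplicate detection depends only on file content, not on where the files sit in the input.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def plan_dedup(paths):
    """Split paths into files to store and files that are exact duplicates.

    Returns (unique, duplicates): unique is the list of paths whose data
    must be stored; duplicates maps each redundant path to the earlier
    path that holds identical content.
    """
    first_seen = {}   # (size, digest) -> first path with that content
    unique = []
    duplicates = {}
    for path in paths:
        key = (os.path.getsize(path), file_digest(path))
        if key in first_seen:
            duplicates[path] = first_seen[key]
        else:
            first_seen[key] = path
            unique.append(path)
    return unique, duplicates
```

An archiver would then compress only the `unique` files and store each entry in `duplicates` as a reference to its original. Hashing every file costs one extra read pass; hashing only files whose sizes collide would reduce that.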
Sorting the data before storing it could also be a good idea: compress all the files with the same extension sequentially, given that they are more likely to contain similar data.
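The ordering step could be as small as the following sketch. The function name `solid_order` and the tie-breaking by file name are assumptions for illustration; the only idea taken from the suggestion above is grouping by extension.

```python
import os


def solid_order(paths):
    """Order files so that those sharing an extension end up adjacent.

    Sort key: lower-cased extension, then lower-cased file name, then the
    full path as a final tie-breaker so the order is deterministic.
    """
    def key(path):
        name = os.path.basename(path)
        ext = os.path.splitext(name)[1].lower()
        return (ext, name.lower(), path)

    return sorted(paths, key=key)


files = ["b/readme.txt", "a/photo.JPG", "a/notes.txt", "c/pic.jpg"]
ordered = solid_order(files)
# ordered == ["a/photo.JPG", "c/pic.jpg", "a/notes.txt", "b/readme.txt"]
```

Feeding files to the compressor in this order keeps similar content close together, so it is more likely to fall inside the same dictionary window.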
These kinds of optimizations are, all in all, relatively cheap in terms of run time, but they can help deduplication dramatically in cases where plzma4 and rep-fma are failing.
Thread for the JPEG codec for the advanced codec pack / .pa format.
We already did our JPEG codec 6-7 years ago; maybe @eugene remembers exactly when :). I believe our initial numbers were a compression gain of around 18% (vs. 22%-24% for the competition), but it was 2x-3x faster than WinZip's JPEG codec even when running single-threaded.
So @eugene started rewriting it earlier this year. It will be fun to see what specs we end up with.