2021-10-04, 20:58  #1
"Evan"
Dec 2020
Montreal
71 Posts 
Optimization and selection of the top scoring polynomial from a larger set
I'm working on a bash script to facilitate optimization and selection of the top-scoring polynomial from a larger set (i.e. the msieve.dat.ms file from running msieve -np1 -nps). I have the actual optimization process done, but I'm still trying to figure out what the optimal choices for sopteffort and ropteffort are, based on the composite's size. I think that VBCurtis' optimized CADO parameters would be useful as a first reference, but those only have ropteffort values. Does anyone have an idea of what the scaling should be for sopteffort, as well as for ropteffort?
I'm attaching my current script to this post. The only requirement right now is that you have an input file that will work with CADO's sopt binary. My current plans are:
- figure out a way to convert msieve's -nps output into CADO-compatible polynomials
- split the sopt effort over multiple threads (currently only ropt has a threads parameter)
- implement duplicate polynomial removal before every step
Feedback/advice is appreciated, thank you!
P.S. I'm not sure if this belongs here or if it should be moved to the programming subforum instead.
Last fiddled with by EdH on 2021-10-06 at 12:08 Reason: forgot something!
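For the duplicate-removal item in the plans above, one possible approach is sketched here. It assumes (this is not stated anywhere in the thread) that the polynomial file separates records with blank lines and that duplicates are textually identical; `dedupe_polys` is a hypothetical helper name, not part of the attached script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop repeated polynomial records from a file.
# Assumption: records are separated by one or more blank lines, and
# duplicates are byte-identical (same coefficients, same formatting).
dedupe_polys() {
    # RS="" puts awk in paragraph mode, so each blank-line-separated
    # record is compared as a whole; only the first copy is printed.
    awk 'BEGIN { RS = ""; ORS = "\n\n" } !seen[$0]++' "$1"
}
```

Something like `dedupe_polys polys > polys.unique` could then run before each optimization step. Polynomials that differ only in whitespace or coefficient formatting would not be caught by this textual comparison.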
2021-10-04, 21:23  #2
"Ed Hall"
Dec 2009
Adirondack Mtns
4,003 Posts 
I think this thread might be better in the "Factoring" subforum, only because it involves more than just CADO-NFS, but let's leave it here for now and if the time comes to move it, I can do so.

2021-10-04, 22:20  #3
"Curtis"
Feb 2005
Riverside, CA
2^{2}·1,249 Posts 
Below 145 digits, I find that using sopt in a CADO poly select is slower than using that time to search a wider adrange.
I've just recently begun trying to determine when sopt=1 provides a benefit greater than its time cost. By C180+, a high number for sopt like 8 or 10 (perhaps even higher) seems to be worthwhile, so there ought to be some kind of scaling-up in between.
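As a starting point for experiments, the scaling described above could be written as a small lookup. This is only a sketch: the two end points come loosely from this post (no sopt below 145 digits, roughly 8 by C180), while the straight-line ramp in between, the cap at 8, and the name `guess_sopteffort` are assumptions with no timing data behind them.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder mapping from composite size (decimal digits)
# to a sopteffort value. Only the end points reflect the post above;
# the linear ramp and the cap are untested assumptions.
guess_sopteffort() {
    local digits=$1
    if [ "$digits" -lt 145 ]; then
        echo 0                              # sopt reported as not worth its time here
    elif [ "$digits" -ge 180 ]; then
        echo 8                              # "8 or 10 ... perhaps even higher"
    else
        echo $(( 1 + (digits - 145) / 5 ))  # assumed ramp: 1 at C145 up to 7 at C179
    fi
}
```

The breakpoints would need replacing once real measurements of sopt time versus the resulting scores are available.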
2021-10-06, 03:19  #4
"Evan"
Dec 2020
Montreal
71 Posts 
Got the script into a working state; it should be ready for anyone to use!
Unzip the scripts zip into your CADO build/[user]/polyselect directory. All you need in order to run it is a file either from msieve's size-optimization output (msieve -np1 -nps) or already in CADO's poly format. Run the script with ./polyFinder.sh (you might need to run chmod +x polyFinder.sh first). It should just run through all the steps on its own.
The default sopteffort and ropteffort are on the high side, but I'll be fixing them up eventually. They shouldn't need to go any higher than what they're currently set to, though. Enjoy!
P.S. @mods: could the thread be renamed to something more appropriate (and moved if necessary)?
Forgot to actually attach the zip, here it is. Could this be merged into the post above? (Done - EdH)
Last fiddled with by EdH on 2021-10-06 at 12:12 Reason: OP Request for merge
2021-10-06, 13:29  #5
"Ed Hall"
Dec 2009
Adirondack Mtns
4,003 Posts 
I d/l'd your latest and am having some difficulties getting it set up.
First, I had to install pv and gawk. They were not defaults in my Ubuntu, but were readily available. I am totally unfamiliar with pv, but it looks like a pipe monitoring tool. Is this supplying the "0:00:00 [76.9k/s] [>..." line?
I then first tried an Msieve polynomial:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
R0: 633156115026647924966242353927160849
R1: 1838892199331450047
A0: 4055358059413560189793755056163937883873351483520
A1: 16581144338788276414460714496465421079736
A2: 636936169159140542086060941338
A3: 29838939405307828973869
A4: 3553046772992
A5: 3408
skew 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: 633156115026647924966242353927160849
Y1: 1838892199331450047
c0: 4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: 636936169159140542086060941338
c3: 29838939405307828973869
c4: 3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
Code:
$ bash polyFinder.sh
Enter the number being factored: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
What file are your polynomials stored in? polys
How many threads will you be using? 24
What degree polynomial are you optimizing? (4-6) 5
How many polynomials would you like to run through root optimization? 12
Invalid input, attempting to reformat.
Parse error: parameter for key c0 is not an mpz:
0:00:00 [76.9k/s] [> ] 1% ETA 09:09:35
Size-optimization completed with (estimated) 0 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
0:00:00 [1.15k/s] [ <=> ]
Best polynomial has been saved to the file: bestpoly
# (8ab2eea72) ./polyselect_ropt -inputpolys toppolys -ropteffort 100 -t 24
# Compiled with gcc 9.3.0
# Compilation flags (C) -std=c99 -g -W -Wall -O2 -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Compilation flags (C++) -export-dynamic -std=c++11 -Wno-c++11-compat -g -W -Wall -O2 -Wno-literal-suffix -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mpclmul
# Info: Will use 24 threads
# Info: ropteffort = 100
# Info: L1_cachesize = 32768, size_tune_sievearray = 16384
# Reading polynomials from toppolys
# 0 polynomial(s) read.
# Info: Using OpenMP with 24 thread(s)
# Stat: total phase took 0.19s
# Stat: rootsieve took 0.00s
# Stat: (stage 1 took 0.00s)
# Stat: (tuning took 0.00s)
# Stat: (stage 2 (sieving) took 0.00s)
# WARNING: No polynomials were found in the input file toppolys
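The missing pv and gawk mentioned at the top of this post could be caught before any work starts. Below is a minimal sketch of such a check; `missing_tools` is a hypothetical helper and is not taken from polyFinder.sh.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print the name of every requested tool
# that cannot be found on PATH (prints nothing if all are present).
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Possible use at the top of a script:
#   missing=$(missing_tools pv gawk)
#   [ -z "$missing" ] || { echo "Please install: $missing" >&2; exit 1; }
```

`command -v` is used instead of `which` because it is a POSIX shell builtin and behaves the same across distributions.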
2021-10-06, 14:26  #6
"Evan"
Dec 2020
Montreal
71 Posts 
Quote:


2021-10-06, 15:16  #7
"Ed Hall"
Dec 2009
Adirondack Mtns
4003_{10} Posts 
Quote:
Code:
n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
Y0: 633156115026647924966242353927160849
Y1: 1838892199331450047
c0: 4055358059413560189793755056163937883873351483520
c1: 16581144338788276414460714496465421079736
c2: 636936169159140542086060941338
c3: 29838939405307828973869
c4: 3553046772992
c5: 3408
skew: 1367091201.68
# size 6.377e-018, alpha -8.529, combined = 5.338e-014 rroots = 5
File is valid sopt input.
0:00:01 [36.9 /s] [==============================] 133% ETA 11:03:16
Size-optimization completed with (estimated) 1 polynomials.
Beginning root-optimization with best 12 polynomials.
# Warning: parameter B is checked by this program but is undocumented.
# Warning: parameter A is checked by this program but is undocumented.
08:19 [46.1m/s] [=====================> ] 74% ETA 0:02:53 ETA 11:14:29

The last line seems a bit misleading. It got to 74% and then just started clocking the time values upward.

2021-10-06, 15:41  #8
"Evan"
Dec 2020
Montreal
107_{8} Posts 
The "undocumented" warning lines come from CADO's ropt binary, which writes them to stderr. You can replace the line that calls ropt with this to hide them:
./polyselect_ropt -inputpolys toppolys -ropteffort 100 -t "$THREADS" 2>/dev/null | pv -pltIaes $(($(wc -l < toppolys) * 21 / 8)) > ropt.out
If you want a list to test the script on, here's one: https://transfer.sh/KagkW6/msieve.dat.ms (degree 6; run head -1 msieve.dat.ms for the composite).
Last fiddled with by Plutie on 2021-10-06 at 16:20 Reason: added example file

2021-10-06, 16:38  #9
"Ed Hall"
Dec 2009
Adirondack Mtns
4,003 Posts 
Success!
It completed with a better poly than supplied: Code:
Best polynomial has been saved to the file: bestpoly
# Stat: rootsieve took 1625.22s
# Stat: (stage 1 took 266.42s)
# Stat: (tuning took 1317.08s)
# Stat: (stage 2 (sieving) took 41.04s)
# Best polynomial found (revision 8ab2eea72):
# n: 346779657259791246313554044276401199475050208141387409790076498969471905612322027394740984520613397049141959352962783652552620174555843715898338877709473791695077522065984368363437509
# Y0: 633156115434329678267970664676644305
# Y1: 1838892199331450047
# c0: 942416842833037178008794982721825660284661427968
# c1: 12659775140765359789715754874236115108152
# c2: 17789742230088331244638103557030
# c3: 25013045517601501098285
# c4: 7330808774912
# c5: 3408
# skew: 1038278114.355
# # lognorm 59.22, E 50.34, alpha -8.88 (proj 1.40), 5 real roots
# # MurphyE(Bf=1.000e+07,Bg=5.000e+06,area=1.000e+16)=5.688e-14
# Average exp_E: 51.00, average E: 50.34

For info, the poly I chose for input was from the Polynomial Request Thread, and has some spun versions later in the thread.
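When several polynomials go through root optimization, the MurphyE lines in the ropt output shown above give a simple way to compare them. The sketch below pulls out the highest score; it assumes each candidate contributes one line containing "MurphyE(...)=value", and `best_murphy_e` is a hypothetical name, not a function from polyFinder.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the largest MurphyE value found in a ropt
# output file. Assumes lines shaped like
#   # MurphyE(Bf=...,Bg=...,area=...)=5.688e-14
best_murphy_e() {
    awk -F'=' '
        /MurphyE\(/ {
            score = $NF + 0          # text after the last "=" as a number
            if (!found || score > best) {
                best = score
                text = $NF           # keep the original notation for printing
                found = 1
            }
        }
        END { if (found) print text }
    ' "$1"
}
```

For example, `best_murphy_e ropt.out` prints just the winning score. Higher MurphyE is better, so the comparison keeps the maximum.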
2021-10-07, 03:39  #10
"Evan"
Dec 2020
Montreal
71 Posts 
"Release" 1.0.1
- Added some more conversion logic; it should now be compatible with the standard poly format used on this forum (just make sure the n value has been added manually; I'll figure that out eventually).
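For readers curious what such a conversion involves, here is a rough sketch based only on the two listings earlier in the thread, where R0/R1 become Y0/Y1, A0..A5 become c0..c5, and "skew" gains a colon. It is not the script's actual logic; `msieve_to_cado` is a hypothetical name, and it assumes one field per line.

```shell
#!/usr/bin/env bash
# Hypothetical filter: rename Msieve-style polynomial keys to CADO-style
# keys. Reads stdin and writes stdout; all other lines pass through.
msieve_to_cado() {
    sed -E \
        -e 's/^R([01]):/Y\1:/' \
        -e 's/^A([0-9]+):/c\1:/' \
        -e 's/^skew[[:space:]]+/skew: /'
}
```

It could be used as `msieve_to_cado < poly.msieve > poly.cado`. Lines already in CADO form, including "skew: ...", are left unchanged, so running it twice does no harm.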
2021-10-07, 12:42  #11
"Ed Hall"
Dec 2009
Adirondack Mtns
4,003 Posts 
My script did come up with a better scoring polynomial, but it came up with the same one as yours initially:
Code:
# MurphyE(Bf=1.000e+07,Bg=5.000e+06,area=1.000e+16)=5.882e-14
Best poly cownoise values: 1431009352.14992 5.96454364e-14