- 27 Apr, 2021 1 commit
-
-
David Monniaux authored
-
- 22 Apr, 2021 2 commits
-
-
Léo Gourdin authored
-
Léo Gourdin authored
-
- 16 Apr, 2021 1 commit
-
-
Sylvain Boulmé authored
-
- 13 Apr, 2021 5 commits
-
-
Cyril SIX authored
-
Cyril SIX authored
-
Cyril SIX authored
-
Cyril SIX authored
It only works correctly if both profiling and static prediction are used: it then compares the two and writes stats to the COMPCERT_PREDICT_STATS file. The stats are of the form:
total correct mispredicts missed
where:
total = total number of CBs encountered
correct = number of correct predictions
mispredicts = times when static prediction guessed wrong (predicted the opposite of profiling, or predicted Some _ when profiling said None)
missed = times when static prediction was not able to give a verdict, though profiling gave one
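A minimal Python sketch of how these four counters could be derived, for illustration only. It assumes each CB is a conditional branch with an optional boolean verdict from profiling and from static prediction (`None` standing in for the `None` case above). The names `PredictStats` and `compare_predictions` are hypothetical and not taken from the CompCert sources. The message does not say how the case where both give no verdict is counted; the sketch assumes it only contributes to `total`.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class PredictStats(NamedTuple):
    total: int
    correct: int
    mispredicts: int
    missed: int


def compare_predictions(
    pairs: Iterable[Tuple[Optional[bool], Optional[bool]]]
) -> PredictStats:
    """Each pair is (profiling verdict, static verdict) for one branch."""
    total = correct = mispredicts = missed = 0
    for profiled, static in pairs:
        total += 1
        if static is None:
            # Static prediction gave no verdict.
            if profiled is not None:
                missed += 1
            # Assumption: both None counts toward total only.
        elif profiled is None or static != profiled:
            # Predicted Some _ when profiling said None, or the opposite.
            mispredicts += 1
        else:
            correct += 1
    return PredictStats(total, correct, mispredicts, missed)


if __name__ == "__main__":
    stats = compare_predictions(
        [(True, True), (True, False), (None, True), (False, None)]
    )
    # Same field order as the stats line: total correct mispredicts missed
    print(*stats)
```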
-
Cyril SIX authored
-
- 12 Apr, 2021 3 commits
-
-
David Monniaux authored
-
David Monniaux authored
-
David Monniaux authored
-
- 09 Apr, 2021 6 commits
-
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
- 08 Apr, 2021 1 commit
-
-
Léo Gourdin authored
-
- 06 Apr, 2021 1 commit
-
-
Léo Gourdin authored
-
- 02 Apr, 2021 6 commits
- 31 Mar, 2021 1 commit
-
-
Cyril SIX authored
Another remnant of trying to devise a complicated algorithm for a problem that was, in fact, very simple: I just had to check whether the branch was within the loop body. I tested it functionally on the benchmarks: only heapsort changes, and it gets slightly worse (4-5%), because the old get_loop_info had made a buggy guess that happened to be lucky for that particular case. The other benchmarks are unchanged: the predictions stay exactly the same. get_loop_info could potentially be improved with a natural loop detection that extends to outer loops (not just inner loops), though I expect the performance improvements would be very small.
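The "is it within the loop body" check can be sketched as follows. This is an illustrative Python sketch, not the actual get_loop_info: it assumes the CFG is given as a predecessor map, that a loop is identified by one back edge (tail to header), and that a branch is predicted toward whichever successor stays inside the loop. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def natural_loop_body(
    preds: Dict[int, List[int]], header: int, tail: int
) -> Set[int]:
    """Nodes of the natural loop defined by the back edge tail -> header."""
    body = {header, tail}
    stack = [tail] if tail != header else []
    while stack:
        node = stack.pop()
        for p in preds.get(node, []):
            # The header is already in the body, so the walk stops there.
            if p not in body:
                body.add(p)
                stack.append(p)
    return body


def predict_from_loop(body: Set[int], ifso: int, ifnot: int) -> Optional[bool]:
    """True if only ifso stays in the loop, False if only ifnot does."""
    so_in, not_in = ifso in body, ifnot in body
    if so_in == not_in:
        return None  # both inside or both outside: no verdict
    return so_in


if __name__ == "__main__":
    # Hypothetical CFG: 1 -> 2 -> 3, with 3 -> 2 (back edge) and 3 -> 4 (exit)
    preds = {2: [1, 3], 3: [2], 4: [3]}
    body = natural_loop_body(preds, header=2, tail=3)
    print(sorted(body), predict_from_loop(body, ifso=2, ifnot=4))
```

Returning `None` when both successors are on the same side of the loop boundary leaves the decision to other heuristics, which matches the optional verdicts mentioned in the stats commit above.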
-
- 30 Mar, 2021 4 commits
-
-
Cyril SIX authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
- 29 Mar, 2021 2 commits
-
-
Cyril SIX authored
While I was developing the new trace linearize, I started off by implementing a big algorithm reasoning on dependencies etc., but I realized later that its performance differed too much (sometimes better, sometimes worse) from the original CompCert. So I stripped it down gradually until its performance (on regular code with just branch prediction) was on par with the base Linearize of CompCert. I was aiming here for something that is either equal or better in terms of performance.
My (then and current) theory is that I have stripped it down so much that now it's just like the algorithm of CompCert, but with a modification for Lcond instructions (see the new linearize_aux_cb). However, I never tested that theory: the code worked, so I left it as is, without any simplification. But now that I need a clear version for my manuscript, I'm digging into it.
It turns out my theory is not quite exact. One difference is that instead of taking the minpc across the chain, I take the pc of the very first block of the chain I create. This was (I think) out of laziness in the middle of two iterations, except that I forgot about it. I tested my new theory by deleting all the dependency calculation code (committed), and also computing a minpc just like the original CompCert (not committed): I get exactly the same Mach code as linearize_aux_cb. So right now, the only difference between linearize_aux_cb and linearize_aux_trace is that slightly different minpc computation.
I think transitioning to linearize_aux_cb 1) will be much clearer than this Frankenstein monster of linearize_aux_trace that I made, and 2) might perform better too. I don't have access to Kalray machines today so I'm leaving this on hold for now, but tomorrow I will test performance to see if there is a regression. If there isn't, I will commit this (and it will be the version described in my manuscript).
If there is a regression, it would mean that selecting the pc of the first node (as opposed to the minpc) performs better, so I'd backport that change to linearize_aux_cb anyway, and there should then be zero difference in the generated code.
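To make the minpc difference concrete, here is a small Python sketch. It is purely illustrative: it assumes a chain is a list of block pcs in the order the chain was built, and that chains are ordered by a single integer key. The ascending sort direction is chosen arbitrarily for the example, and none of the names come from the CompCert sources.

```python
from typing import Callable, List

Chain = List[int]


def key_minpc(chain: Chain) -> int:
    """Key used by the minpc variant: smallest pc anywhere in the chain."""
    return min(chain)


def key_first_pc(chain: Chain) -> int:
    """Key used by the first-block variant: pc of the chain's first block."""
    return chain[0]


def order_chains(chains: List[Chain], key: Callable[[Chain], int]) -> List[Chain]:
    # Assumption: ascending order, for illustration only.
    return sorted(chains, key=key)


if __name__ == "__main__":
    chains = [[5, 2], [3, 4]]
    print(order_chains(chains, key_minpc))     # keys 2 and 3
    print(order_chains(chains, key_first_pc))  # keys 5 and 3
```

The two keys coincide whenever the first block of a chain also has its smallest pc, which may be why the difference went unnoticed on most code.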
-
Léo Gourdin authored
-
- 26 Mar, 2021 4 commits
-
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
Léo Gourdin authored
-
- 23 Mar, 2021 1 commit
-
-
Léo Gourdin authored
-
- 21 Mar, 2021 1 commit
-
-
Léo Gourdin authored
-
- 10 Mar, 2021 1 commit
-
-
Léo Gourdin authored
-