M. Popov, C. Akel, W. Jalby, and P. De-oliveira-castro, Piecewise Holistic Autotuning of Compiler and Runtime Parameters, Euro-Par 2016: Parallel Processing - 22nd International Conference, pp.238-250, 2016.

URL : https://hal.archives-ouvertes.fr/hal-01417211

C. Lattner and V. Adve, LLVM: A compilation framework for lifelong program analysis & transformation, International Symposium on Code Generation and Optimization (CGO 2004), pp.75-86, 2004.

T. Kisuki, P. Knijnenburg, M. O'Boyle, and H. Wijshoff, Iterative compilation in program optimization, Proc. CPC10 (Compilers for Parallel Computers), pp.35-44, 2000.

A. Mazouz and D. Barthou, Performance evaluation and analysis of thread pinning strategies on multi-core platforms: Case study of SPEC OMP applications on Intel architectures, 2011 International Conference on High Performance Computing & Simulation, pp.273-279, 2011.
DOI : 10.1109/HPCSim.2011.5999834

URL : https://hal.archives-ouvertes.fr/inria-00636845

B. Rountree, D. K. Lowenthal, B. R. de Supinski, M. Schulz, V. W. Freeh et al., Adagio: making DVS practical for complex HPC applications, Proceedings of the 23rd International Conference on Supercomputing, ICS '09, pp.460-469, 2009.
DOI : 10.1145/1542275.1542340

S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August, Compiler optimization-space exploration, International Symposium on Code Generation and Optimization (CGO 2003), pp.204-215, 2003.
DOI : 10.1109/CGO.2003.1191546

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.1622

S. R. Ladd, Acovea: Analysis of compiler options via evolutionary algorithm, 2007.

K. D. Cooper, P. J. Schielke, and D. Subramanian, Optimizing for reduced code space using genetic algorithms, ACM SIGPLAN Notices, vol.34, issue.7, pp.1-9, 1999.
DOI : 10.1145/315253.314414

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.1586

K. Hoste and L. Eeckhout, COLE: compiler optimization level exploration, Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, CGO '08, pp.165-174, 2008.
DOI : 10.1145/1356058.1356080

P. De-oliveira-castro, E. Petit, A. Farjallah, and W. Jalby, Adaptive sampling for performance characterization of application kernels, Concurrency and Computation: Practice and Experience, vol.25, issue.17, pp.2345-2362, 2013.
DOI : 10.1109/SC.2010.2

URL : https://hal.archives-ouvertes.fr/hal-00952288

G. Fursin et al., Milepost GCC: Machine Learning Enabled Self-tuning Compiler, International Journal of Parallel Programming, vol.39, issue.3, pp.296-327, 2011.
DOI : 10.1007/s10766-010-0161-2

URL : https://hal.archives-ouvertes.fr/hal-00685276

P. De-oliveira-castro, C. Akel, E. Petit, M. Popov, and W. Jalby, CERE: LLVM-based Codelet Extractor and Replayer for piecewise benchmarking and optimization, ACM Transactions on Architecture and Code Optimization (TACO), vol.12, issue.1, p.6, 2015.

M. Popov, C. Akel, F. Conti, W. Jalby, and P. De-oliveira-castro, PCERE: Fine-Grained Parallel Benchmark Decomposition for Scalability Prediction, 2015 IEEE International Parallel and Distributed Processing Symposium, pp.1151-1160, 2015.
DOI : 10.1109/IPDPS.2015.19

URL : https://hal.archives-ouvertes.fr/hal-01417304

R. E. Kessler, M. D. Hill, and D. A. Wood, A comparison of trace-sampling techniques for multi-megabyte caches, IEEE Transactions on Computers, vol.43, issue.6, pp.664-675, 1994.
DOI : 10.1109/12.286300

L. Kaufman and P. J. Rousseeuw, Finding groups in data: an introduction to cluster analysis, 2009.
DOI : 10.1002/9780470316801

F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin et al., Using Machine Learning to Focus Iterative Optimization, International Symposium on Code Generation and Optimization (CGO'06), pp.295-305, 2006.
DOI : 10.1109/CGO.2006.37

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.2976

A. S. Charif-rubial, E. Oseret, J. Noudohouenou, W. Jalby, and G. Lartigue, CQA: A code quality analyzer tool at binary level, 2014 21st International Conference on High Performance Computing (HiPC), pp.1-10, 2014.
DOI : 10.1109/HiPC.2014.7116904

J. Treibig, G. Hager, and G. Wellein, LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments, 2010 39th International Conference on Parallel Processing Workshops, pp.207-216, 2010.
DOI : 10.1109/ICPPW.2010.38

URL : http://arxiv.org/abs/1004.4431

J. H. Ward, Hierarchical Grouping to Optimize an Objective Function, Journal of the American Statistical Association, vol.58, issue.301, pp.236-244, 1963.
DOI : 10.1080/01621459.1963.10500845

R. Thorndike, Who belongs in the family?, Psychometrika, vol.18, issue.4, pp.267-276, 1953.
DOI : 10.1007/BF02289263

P. De-oliveira-castro, Y. Kashnikov, C. Akel, M. Popov, and W. Jalby, Fine-grained Benchmark Subsetting for System Selection, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, p.132, 2014.
DOI : 10.1145/2581122.2544144

URL : https://hal.archives-ouvertes.fr/hal-00952256

Y. Chen, Y. Huang, L. Eeckhout, G. Fursin, L. Peng et al., Evaluating iterative optimization across 1000 data sets, Proceedings of the ACM SIGPLAN 2010 Conference on Programming Language Design and Implementation (PLDI'10), 2010.
DOI : 10.1145/1809028.1806647

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.188.4481

M. Curtis-maury, F. Blagojevic, C. D. Antonopoulos, and D. S. Nikolopoulos, Prediction-Based Power-Performance Adaptation of Multithreaded Scientific Codes, IEEE Transactions on Parallel and Distributed Systems, vol.19, issue.10, pp.1396-1410, 2008.
DOI : 10.1109/TPDS.2007.70804

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.6395

D. Bailey, The NAS parallel benchmarks---summary and preliminary results, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.158-165, 1991.
DOI : 10.1145/125826.125925

M. Popov, NAS 3.0 C OpenMP, http://benchmark-subsetting.github.io/cNPB

E. Baysal, Reverse time migration, Geophysics, vol.48, issue.11, p.1514, 1983.

L. G. Martins, R. Nobre, J. M. Cardoso, A. C. Delbem, and E. Marques, Clustering-based selection for the exploration of compiler optimization sequences, ACM Transactions on Architecture and Code Optimization (TACO), vol.13, issue.8, 2016.

A. H. Ashouri, G. Mariani, G. Palermo, E. Park, J. Cavazos et al., COBAYN: Compiler autotuning framework using Bayesian networks, ACM Transactions on Architecture and Code Optimization, vol.13, issue.2, 2016.
DOI : 10.1145/2928270

P. A. Kulkarni, D. B. Whalley, G. S. Tyson, and J. W. Davidson, In search of near-optimal optimization phase orderings, ACM SIGPLAN Notices, vol.41, issue.7, pp.83-92, 2006.
DOI : 10.1145/1159974.1134663

K. D. Cooper, A. Grosul, T. J. Harvey, S. Reeves, D. Subramanian et al., ACME: adaptive compilation made efficient, ACM SIGPLAN Notices, vol.40, issue.7, pp.69-77, 2005.
DOI : 10.1145/1070891.1065921

S. Purini and L. Jain, Finding good optimization sequences covering program space, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, p.56, 2013.
DOI : 10.1145/2400682.2400715

L. Eeckhout, J. Sampson, and B. Calder, Exploiting program microarchitecture independent characteristics and phase behavior for reduced benchmark suite simulation, Proceedings of the IEEE International Workload Characterization Symposium, pp.2-12, 2005.
DOI : 10.1109/IISWC.2005.1525996

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.217.3636

T. Sherwood, E. Perelman, and B. Calder, Basic block distribution analysis to find periodic behavior and simulation points in applications, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques, pp.3-14, 2001.
DOI : 10.1109/PACT.2001.953283

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.7813

T. E. Carlson, W. Heirman, K. Van-craeynest, and L. Eeckhout, BarrierPoint: Sampled simulation of multi-threaded applications, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2014.
DOI : 10.1109/ISPASS.2014.6844456

G. Fursin, A. Cohen, M. O'Boyle, and O. Temam, Quick and Practical Run-Time Evaluation of Multiple Program Optimizations, Transactions on High-Performance Embedded Architectures and Compilers I, pp.34-53, 2007.
DOI : 10.1109/SC.1998.10004

URL : https://hal.archives-ouvertes.fr/inria-00084110

P. A. Kulkarni, M. R. Jantz, and D. B. Whalley, Improving both the performance benefits and speed of optimization phase sequence searches, ACM SIGPLAN Notices, vol.45, issue.4, pp.95-104, 2010.
DOI : 10.1145/1755951.1755903

C. Liao, D. J. Quinlan, R. Vuduc, and T. Panas, Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization, Languages and Compilers for Parallel Computing, pp.308-322, 2010.
DOI : 10.1007/978-3-642-13374-9_21

C. Akel, Y. Kashnikov, P. De-oliveira-castro, and W. Jalby, Is Source-Code Isolation Viable for Performance Characterization?, 2013 42nd International Conference on Parallel Processing, 2013.
DOI : 10.1109/ICPP.2013.116

URL : https://hal.archives-ouvertes.fr/hal-00952290