
Publications of year 2015
Thesis
  1. A. Casadei. Optimizations of hybrid sparse linear solvers relying on Schur complement and domain decomposition approaches. PhD thesis, Université de Bordeaux, October 2015. Keyword(s): Sparse.
    Abstract:
    In this thesis, we focus on the parallel solving of large sparse linear systems. Our main interest is in direct-iterative hybrid solvers such as HIPS, MAPHYS, PDSLIN or SHYLU, which rely on domain decomposition and Schur complement approaches. Although these solvers are not as time and space consuming as direct methods, they still suffer from serious overheads. In the first part, we thus present the existing techniques for reducing memory consumption, and we present a new method which does not impact the numerical robustness of the preconditioner. This technique reduces the memory peak through a special scheduling of computation, allocation, and freeing tasks, in particular in the Schur coupling blocks of the matrix. In the second part, we focus on the load balancing of the domain decomposition in a parallel context. This problem consists in partitioning the adjacency graph of the matrix into as many domains as desired. We point out that good load balancing for the most expensive steps of a hybrid solver such as MAPHYS relies on balancing both the interior nodes and the interface nodes of the domains. Until now, however, graph partitioners such as METIS or SCOTCH optimized only the first criterion (i.e. the balancing of interior nodes) in the context of sparse matrix ordering. We propose several variations of the existing algorithms to improve the balancing of interface nodes and interior nodes simultaneously. All our changes are implemented in the SCOTCH partitioner. We present our results on a large collection of matrices coming from real industrial cases.

    @phdthesis{t:LaBRI::AC15,
    TITLE = {{Optimizations of hybrid sparse linear solvers relying on Schur complement and domain decomposition approaches}},
    AUTHOR = {Casadei, A.},
    URL = {https://tel.archives-ouvertes.fr/tel-01228520},
    NUMBER = {2015BORD0186},
    SCHOOL = {{Universit{\'e} de Bordeaux}},
    YEAR = {2015},
    MONTH = Oct,
    KEYWORDS = {Sparse},
    PDF = {https://tel.archives-ouvertes.fr/tel-01228520/file/CASADEI_ASTRID_2015.pdf},
    HAL_ID = {tel-01228520},
    HAL_VERSION = {v1},
    ABSTRACT = { In this thesis, we focus on the parallel solving of large sparse linear systems. Our main interest is in direct-iterative hybrid solvers such as HIPS, MAPHYS, PDSLIN or SHYLU, which rely on domain decomposition and Schur complement approaches. Although these solvers are not as time and space consuming as direct methods, they still suffer from serious overheads. In the first part, we thus present the existing techniques for reducing memory consumption, and we present a new method which does not impact the numerical robustness of the preconditioner. This technique reduces the memory peak through a special scheduling of computation, allocation, and freeing tasks, in particular in the Schur coupling blocks of the matrix. In the second part, we focus on the load balancing of the domain decomposition in a parallel context. This problem consists in partitioning the adjacency graph of the matrix into as many domains as desired. We point out that good load balancing for the most expensive steps of a hybrid solver such as MAPHYS relies on balancing both the interior nodes and the interface nodes of the domains. Until now, however, graph partitioners such as METIS or SCOTCH optimized only the first criterion (i.e. the balancing of interior nodes) in the context of sparse matrix ordering. We propose several variations of the existing algorithms to improve the balancing of interface nodes and interior nodes simultaneously. All our changes are implemented in the SCOTCH partitioner. We present our results on a large collection of matrices coming from real industrial cases. }
    }
    


  2. X. Lacoste. Scheduling and memory optimizations for sparse direct solver on multi-core/multi-gpu cluster systems. PhD thesis, Université de Bordeaux, February 2015. Keyword(s): Sparse.
    Abstract:
    The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this thesis, we study the benefits and the limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. To do so, we have to describe the factorization algorithm as a task graph that we provide to the runtime system. The runtime can then decide how to process and optimize the graph traversal in order to maximize the algorithm efficiency for the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its original internal scheduler, PaRSEC, and StarPU frameworks is performed. The analysis highlights that these generic task-based runtimes achieve results comparable to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer. In this thesis, we also study the possibilities of building a distributed sparse linear solver on top of task-based runtime systems to target heterogeneous clusters. To permit an efficient and easy usage of these developments in parallel simulations, we also present an optimized distributed interface that hides the complexity of the construction of a distributed matrix from the user.

    @phdthesis{t:LaBRI::XL15,
    TITLE = {{Scheduling and memory optimizations for sparse direct solver on multi-core/multi-gpu cluster systems}},
    AUTHOR = {Lacoste, X.},
    URL = {https://tel.archives-ouvertes.fr/tel-01222565},
    NUMBER = {2015BORD0016},
    SCHOOL = {{Universit{\'e} de Bordeaux}},
    YEAR = {2015},
    MONTH = Feb,
    KEYWORDS = {Sparse},
    PDF = {https://tel.archives-ouvertes.fr/tel-01222565/file/LACOSTE_XAVIER_2015.pdf},
    HAL_ID = {tel-01222565},
    HAL_VERSION = {v1},
    ABSTRACT = { The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this thesis, we study the benefits and the limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. To do so, we have to describe the factorization algorithm as a task graph that we provide to the runtime system. The runtime can then decide how to process and optimize the graph traversal in order to maximize the algorithm efficiency for the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its original internal scheduler, PaRSEC, and StarPU frameworks is performed. The analysis highlights that these generic task-based runtimes achieve results comparable to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer. In this thesis, we also study the possibilities of building a distributed sparse linear solver on top of task-based runtime systems to target heterogeneous clusters. To permit an efficient and easy usage of these developments in parallel simulations, we also present an optimized distributed interface that hides the complexity of the construction of a distributed matrix from the user. }
    }
    


  3. S. Moustafa. Massively Parallel Cartesian Discrete Ordinates Method for Neutron Transport Simulation. PhD thesis, Université de Bordeaux, December 2015. Keyword(s): Neutron.
    @phdthesis{t:LaBRI::SM15,
    TITLE = {{Massively Parallel Cartesian Discrete Ordinates Method for Neutron Transport Simulation}},
    AUTHOR = {Moustafa, S.},
    URL = {https://tel.archives-ouvertes.fr/tel-01379686},
    NUMBER = {2015BORD0408},
    SCHOOL = {{Universit{\'e} de Bordeaux}},
    YEAR = {2015},
    MONTH = Dec,
    KEYWORDS = {Neutron},
    PDF = {https://tel.archives-ouvertes.fr/tel-01379686/file/MOUSTAFA_SALLI_2015.pdf},
    HAL_ID = {tel-01379686},
    HAL_VERSION = {v2},
    
    }
    


Conference articles
  1. A. Casadei and P. Ramet. Towards a recursive graph bipartitioning algorithm for well balanced domain decomposition. In Mini-Symposium on Combinatorial Issues in Sparse Matrix Computation at ICIAM'15 conference, Beijing, China, August 2015. Keyword(s): Sparse.
    Abstract:
    In the context of hybrid sparse linear solvers based on domain decomposition and Schur complement approaches, getting a domain decomposition tool leading to a good balancing of both the internal node set size and the interface node set size is a critical point for parallel computation. We propose several variations of the existing algorithms in the multilevel Scotch partitioner and we illustrate the improved results on a collection of graphs coming from numerical scientific applications.

    @InProceedings{C:LaBRI::iciam15b,
    author = {Casadei, A. and Ramet, P.},
    title = {Towards a recursive graph bipartitioning algorithm for well balanced domain decomposition},
    OPTcrossref = {},
    OPTkey = {},
    booktitle = {Mini-Symposium on "Combinatorial Issues in Sparse Matrix Computation" at ICIAM'15 conference},
    OPTpages = {},
    year = {2015},
    OPTeditor = {},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    address = {Beijing, China},
    month = aug,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/iciam2.pdf},
    KEYWORDS = "Sparse",
    ABSTRACT = {In the context of hybrid sparse linear solvers based on domain decomposition and Schur complement approaches, getting a domain decomposition tool leading to a good balancing of both the internal node set size and the interface node set size is a critical point for parallel computation. We propose several variations of the existing algorithms in the multilevel Scotch partitioner and we illustrate the improved results on a collection of graphs coming from numerical scientific applications.} 
    }
    


  2. A. Casadei, P. Ramet, and J. Roman. Towards a recursive graph bipartitioning algorithm for well balanced domain decomposition. In Mini-Symposium on Partitioning for Complex Objectives at SIAM CSE'15 conference, Salt Lake City, USA, March 2015. Keyword(s): Sparse.
    @InProceedings{C:LaBRI::cse15b,
    author = {Casadei, A. and Ramet, P. and Roman, J.},
    title = {Towards a recursive graph bipartitioning algorithm for well balanced domain decomposition},
    OPTcrossref = {},
    OPTkey = {},
    booktitle = {Mini-Symposium on "Partitioning for Complex Objectives" at SIAM CSE'15 conference},
    OPTpages = {},
    year = {2015},
    OPTeditor = {},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    address = {Salt Lake City, USA},
    month = mar,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    KEYWORDS = "Sparse" 
    }
    


  3. M. Faverge, G. Pichon, P. Ramet, and J. Roman. Blocking strategy optimizations for sparse direct linear solver on heterogeneous architectures. In Sparse Days, Saint Girons, France, June 2015. Keyword(s): Sparse.
    Abstract:
    In the context of solving sparse linear systems, an ordering process partitions the matrix graph to minimize both fill-in and computational cost. We found that the ordering strategy used within supernodes can be enhanced to reduce the number of off-diagonal blocks, thereby increasing block sizes and kernel performance. This reordering turns out to have the same complexity as the factorization algorithm, but allows for more efficient BLAS kernels. On the other hand, supernodes that are too large need to be split to create more parallelism. The regular splitting strategy, when applied locally, significantly impacts the number of off-diagonal blocks and might have a negative effect on efficiency. In this talk, we present both a new strategy to improve supernode ordering and a splitting strategy, both of which enlarge the off-diagonal block sizes without changing the computational cost of the factorization. Performance gains on the supernodal solver PaStiX are shown on multi-core and heterogeneous architectures.

    @InProceedings{C:LaBRI::sparsedays2015,
    author = {Faverge, M. and Pichon, G. and Ramet, P. and Roman, J.},
    title = {{Blocking strategy optimizations for sparse direct linear solver on heterogeneous architectures}},
    booktitle = {Sparse Days},
    OPTcrossref = {},
    OPTkey = {},
    OPTpages = {},
    year = 2015,
    OPTeditor = {},
    OPTnumber = {},
    OPTvolume = {},
    OPTseries = {},
    address = {Saint Girons, France},
    month = jun,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/sparsedays15.pdf},
    KEYWORDS = "Sparse",
    ABSTRACT = { In the context of solving sparse linear systems, an ordering process partitions the matrix graph to minimize both fill-in and computational cost. We found that the ordering strategy used within supernodes can be enhanced to reduce the number of off-diagonal blocks, thereby increasing block sizes and kernel performance. This reordering turns out to have the same complexity as the factorization algorithm, but allows for more efficient BLAS kernels. On the other hand, supernodes that are too large need to be split to create more parallelism. The regular splitting strategy, when applied locally, significantly impacts the number of off-diagonal blocks and might have a negative effect on efficiency. In this talk, we present both a new strategy to improve supernode ordering and a splitting strategy, both of which enlarge the off-diagonal block sizes without changing the computational cost of the factorization. Performance gains on the supernodal solver PaStiX are shown on multi-core and heterogeneous architectures. }
    }
    


  4. M. Faverge, G. Pichon, P. Ramet, and J. Roman. On the use of H-Matrix Arithmetic in PaStiX: a Preliminary Study. In Workshop on Fast Solvers, Toulouse, France, June 2015. Keyword(s): Low-rank compression.
    Abstract:
    When solving large sparse linear systems, both the amount of memory needed and the computational cost represent a burden to efficiency. In order to solve larger systems, low-rank strategies are used to reduce the overall complexity of a solver. In this talk, we present a preliminary study of the use of H-Matrix arithmetic in a supernodal solver. We also present a new feature in PaStiX, a reordering strategy to reduce the number of off-diagonal blocks in the symbolic factorization. It allows BLAS kernels to be more efficient, and those ideas could be explored in the context of a low-rank strategy.

    @InProceedings{C:LaBRI::CIMI15,
    author = {Faverge, M. and Pichon, G. and Ramet, P. and Roman, J.},
    title = {{On the use of H-Matrix Arithmetic in PaStiX: a Preliminary Study}},
    booktitle = {Workshop on Fast Solvers},
    OPTcrossref = {},
    OPTkey = {},
    OPTpages = {},
    year = 2015,
    OPTeditor = {},
    OPTnumber = {},
    OPTvolume = {},
    OPTseries = {},
    address = {Toulouse, France},
    month = jun,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/cimi15.pdf},
    KEYWORDS = "Low-rank compression",
    ABSTRACT = { When solving large sparse linear systems, both the amount of memory needed and the computational cost represent a burden to efficiency. In order to solve larger systems, low-rank strategies are used to reduce the overall complexity of a solver. In this talk, we present a preliminary study of the use of H-Matrix arithmetic in a supernodal solver. We also present a new feature in PaStiX, a reordering strategy to reduce the number of off-diagonal blocks in the symbolic factorization. It allows BLAS kernels to be more efficient, and those ideas could be explored in the context of a low-rank strategy. } 
    }
    


  5. X. Lacoste, M. Faverge, and P. Ramet. A task-based sparse direct solver suited for large scale hierarchical/heterogeneous architectures. In Mini-Symposium on Task-based Scientific Computing Applications at SIAM CSE'15 conference, Salt Lake City, USA, March 2015. Keyword(s): Sparse.
    @InProceedings{C:LaBRI::cse15a,
    author = {Lacoste, X. and Faverge, M. and Ramet, P.},
    title = {A task-based sparse direct solver suited for large scale hierarchical/heterogeneous architectures},
    OPTcrossref = {},
    OPTkey = {},
    booktitle = {Mini-Symposium on "Task-based Scientific Computing Applications" at SIAM CSE'15 conference},
    OPTpages = {},
    year = {2015},
    OPTeditor = {},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    address = {Salt Lake City, USA},
    month = mar,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    KEYWORDS = "Sparse" 
    }
    


  6. S. Moustafa, M. Faverge, L. Plagne, and P. Ramet. 3D Cartesian Transport Sweep for Massively Parallel Architectures with PARSEC. In 29th IEEE International Parallel & Distributed Processing Symposium, IPDPS'15, Hyderabad, India, pages 581-590, May 2015. ISSN: 1530-2075. Keyword(s): Neutron.
    @inproceedings{moustafa:hal-01078362,
    TITLE = {{3D Cartesian Transport Sweep for Massively Parallel Architectures with PARSEC}},
    AUTHOR = {Moustafa, S. and Faverge, M. and Plagne, L. and Ramet, P.},
    URL = {https://hal.inria.fr/hal-01078362},
    BOOKTITLE = {{29th IEEE International Parallel \& Distributed Processing Symposium, IPDPS'15}},
    ADDRESS = {Hyderabad, India},
    YEAR = {2015},
    MONTH = May,
    PAGES = {581-590},
    DOI = {10.1109/IPDPS.2015.75},
    ISSN={1530-2075},
    KEYWORDS = "Neutron",
    HAL_ID = {hal-01078362},
    HAL_VERSION = {v1},
    
    }
    


  7. G. Pichon, A. Haidar, M. Faverge, and J. Kurzak. Divide and Conquer Symmetric Tridiagonal Eigensolver for Multicore Architectures. In IEEE International Parallel & Distributed Processing Symposium (IPDPS 2015), Hyderabad, India, May 2015.
    @inproceedings{pichon:hal-01078356,
    TITLE = {{Divide and Conquer Symmetric Tridiagonal Eigensolver for Multicore Architectures}},
    AUTHOR = {Pichon, G. and Haidar, A. and Faverge, M. and Kurzak, J.},
    URL = {https://hal.inria.fr/hal-01078356},
    BOOKTITLE = {{IEEE International Parallel \& Distributed Processing Symposium (IPDPS 2015)}},
    ADDRESS = {Hyderabad, India},
    YEAR = {2015},
    MONTH = May,
    PDF = {https://hal.inria.fr/hal-01078356/file/dnc_final.pdf},
    HAL_ID = {hal-01078356},
    HAL_VERSION = {v3},
    
    }
    


  8. P. Ramet. On the design of parallel linear solvers for large scale problems. In Mini-Symposium on Recent advances in matrix computations for extreme-scale computers at ICIAM'15 conference, Beijing, China, August 2015. Keyword(s): Sparse.
    Abstract:
    In this talk we will discuss our research activities on the design of parallel linear solvers for large scale problems, which range from dense linear algebra to parallel sparse direct solvers and hybrid iterative-direct approaches. In particular, we will describe the implementations designed on top of runtime systems that should provide both code and performance portability. Finally, we will present some preliminary results on the integration of H-matrix kernels in our sparse direct solver framework.

    @InProceedings{C:LaBRI::iciam15a,
    author = {Ramet, P.},
    title = {On the design of parallel linear solvers for large scale problems},
    OPTcrossref = {},
    OPTkey = {},
    booktitle = {Mini-Symposium on "Recent advances in matrix computations for extreme-scale computers" at ICIAM'15 conference},
    OPTpages = {},
    year = {2015},
    OPTeditor = {},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    address = {Beijing, China},
    month = aug,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/iciam1.pdf},
    KEYWORDS = "Sparse",
    ABSTRACT = {In this talk we will discuss our research activities on the design of parallel linear solvers for large scale problems, which range from dense linear algebra to parallel sparse direct solvers and hybrid iterative-direct approaches. In particular, we will describe the implementations designed on top of runtime systems that should provide both code and performance portability. Finally, we will present some preliminary results on the integration of H-matrix kernels in our sparse direct solver framework.}
    }
    


Internal reports
  1. M. Alaya, M. Faverge, X. Lacoste, A. Péré-Laperne, J. Péré-Laperne, P. Ramet, and T. Terraz. Simul'Elec and PASTIX interface specifications. Technical Report RT-0458, INRIA Bordeaux ; AlgoTech, April 2015.
    @techreport{alaya:hal-01142204,
    TITLE = {{Simul'Elec and PASTIX interface specifications}},
    AUTHOR = {Alaya, M. and Faverge, M. and Lacoste, X. and P{\'e}r{\'e}-Laperne, A. and P{\'e}r{\'e}-Laperne, J. and Ramet, P. and Terraz, T.},
    URL = {https://hal.inria.fr/hal-01142204},
    TYPE = {Technical Report},
    NUMBER = {RT-0458},
    INSTITUTION = {{INRIA Bordeaux ; AlgoTech}},
    YEAR = {2015},
    MONTH = Apr,
    PDF = {https://hal.inria.fr/hal-01142204/file/RT-458.pdf},
    HAL_ID = {hal-01142204},
    HAL_VERSION = {v1},
    
    }
    


  2. M. Faverge, X. Lacoste, P. Ramet, and T. Terraz. Etude de la factorisation directe hétérogène et de la factorisation incomplète sur solveur PaStiX appliquées à des systèmes issus de problèmes du CEA/CESTA. Technical report, C.E.A. / C.E.S.T.A, 2015. Note: Rapport Final. Keyword(s): Sparse.
    @TechReport{f:LaBRI::cesta15,
    author = "Faverge, M. and Lacoste, X. and Ramet, P. and Terraz, T.",
    title = "Etude de la factorisation directe h\'et\'erog\`ene et de la factorisation incompl\`ete sur solveur PaStiX appliqu\'ees \`a des syst\`emes issus de probl\`emes du CEA/CESTA",
    institution = "C.E.A. / C.E.S.T.A",
    year = "2015",
    note = "Rapport Final",
    KEYWORDS = "Sparse" 
    }
    


Miscellaneous
  1. H. Beaugendre, L. Lestandi, and P. Ramet. Benchmarking of the linear solver PaStiX for integration in LESCAPE. Internship Inria, March 2015.
    @Misc{c:LaBRI::lescape,
    OPTkey = {},
    author = {Beaugendre, H. and Lestandi, L. and Ramet, P.},
    title = {{Benchmarking of the linear solver PaStiX for integration in LESCAPE}},
    howpublished = {Internship Inria},
    month = mar,
    year = 2015,
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/lescape.pdf},
    
    }
    


  2. M. Faverge, X. Lacoste, and P. Ramet. PaStiX: Parallel Sparse Matrix Package. JDEV2015 : Journées Développement Logiciel, July 2015.
    @Misc{c:LaBRI::JDEV,
    OPTkey = {},
    author = {Faverge, M. and Lacoste, X. and Ramet, P.},
    title = {{PaStiX: Parallel Sparse Matrix Package}},
    howpublished = {JDEV2015 : Journ\'ees D\'eveloppement Logiciel},
    month = jul,
    year = 2015,
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/jdev.pdf},
    
    }
    


  3. M. Faverge, G. Pichon, P. Ramet, and J. Roman. Blocking strategy optimization for sparse direct linear solvers on heterogeneous architectures. SOLHAR meeting, Lyon, France, June 2015. Keyword(s): Sparse.
    @Misc{c:LaBRI::pastix-solhar3,
    OPTkey = {},
    author = {Faverge, M. and Pichon, G. and Ramet, P. and Roman, J.},
    title = {Blocking strategy optimization for sparse direct linear solvers on heterogeneous architectures},
    howpublished = {SOLHAR meeting, Lyon, France},
    month = jun,
    year = 2015,
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/pastix-solhar3.pdf},
    KEYWORDS = "Sparse" 
    }
    


  4. M. Faverge, G. Pichon, P. Ramet, and J. Roman. Blocking strategy optimizations for sparse direct linear solver on heterogeneous architectures. Workshop INRIA-CNPq, HOSCAR meeting, Sophia-Antipolis, France, September 2015. Keyword(s): Sparse.
    Abstract:
    In the context of solving sparse linear systems, an ordering process partitions the matrix graph to minimize both fill-in and computational cost. We found that the ordering strategy used within supernodes can be enhanced to reduce the number of off-diagonal blocks, thereby increasing block sizes and kernel performance. This reordering turns out to have the same complexity as the factorization algorithm, but allows for more efficient BLAS kernels. On the other hand, supernodes that are too large need to be split to create more parallelism. The regular splitting strategy, when applied locally, significantly impacts the number of off-diagonal blocks and might have a negative effect on efficiency. In this talk, we present both a new strategy to improve supernode ordering and a splitting strategy, both of which enlarge the off-diagonal block sizes without changing the computational cost of the factorization. Performance gains on the supernodal solver PaStiX are shown on multi-core and heterogeneous architectures.

    @Misc{c:LaBRI::HOSCAR2015,
    author = {Faverge, M. and Pichon, G. and Ramet, P. and Roman, J.},
    title = {{Blocking strategy optimizations for sparse direct linear solver on heterogeneous architectures}},
    OPTcrossref = {},
    OPTkey = {},
    howpublished = {Workshop INRIA-CNPq, HOSCAR meeting, Sophia-Antipolis, France},
    OPTpages = {},
    year = {2015},
    OPTeditor = {},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    month = sep,
    OPTorganization = {},
    OPTpublisher = {},
    OPTnote = {},
    OPTannote = {},
    KEYWORDS = "Sparse",
    ABSTRACT = { In the context of solving sparse linear systems, an ordering process partitions the matrix graph to minimize both fill-in and computational cost. We found that the ordering strategy used within supernodes can be enhanced to reduce the number of off-diagonal blocks, thereby increasing block sizes and kernel performance. This reordering turns out to have the same complexity as the factorization algorithm, but allows for more efficient BLAS kernels. On the other hand, supernodes that are too large need to be split to create more parallelism. The regular splitting strategy, when applied locally, significantly impacts the number of off-diagonal blocks and might have a negative effect on efficiency. In this talk, we present both a new strategy to improve supernode ordering and a splitting strategy, both of which enlarge the off-diagonal block sizes without changing the computational cost of the factorization. Performance gains on the supernodal solver PaStiX are shown on multi-core and heterogeneous architectures. }
    }
    


  5. Y. Laizet, A. Moreau, J.-M. Frigerio, Ph. Chaumeil, P. Gay, P. Ramet, D. Sherman, and A. Franc. Biodiversiton : application du HPC à l'étude de la biodiversité. Seminar at MCIA (Mésocentre de Calcul Intensif Aquitain), March 2015.
    @Misc{c:LaBRI::MCIA15,
    OPTkey = {},
    author = {Laizet, Y. and Moreau, A. and Frigerio, J.-M. and Chaumeil, Ph. and Gay, P. and Ramet, P. and Sherman, D. and Franc, A.},
    title = {Biodiversiton : application du HPC \`a l'\'etude de la biodiversit\'e},
    howpublished = {Seminar at MCIA (M\'esocentre de Calcul Intensif Aquitain)},
    month = mar,
    year = 2015,
    OPTnote = {},
    OPTannote = {},
    URL = {http://www.labri.fr/~ramet/restricted/biodiversiton.pdf},
    
    }
    


  6. P. Ramet. On the design of parallel linear solvers for large scale problems. Formation CNRS, Journée problème de Poisson, Paris, France, January 2015.
    @Misc{c:LaBRI::CNRS15,
    author = {Ramet, P.},
    title = {On the design of parallel linear solvers for large scale problems},
    month = jan,
    year = {2015},
    howpublished = {Formation CNRS, Journ\'ee probl\`eme de Poisson, Paris, France} 
    }
    


  7. P. Ramet. Solveurs Directs. Maison de la Simulation, Formation PATC, Algèbre Linéaire Creuse Parallèle, Paris, France, April 2015.
    @Misc{c:LaBRI::MDS15,
    author = {Ramet, P.},
    title = {Solveurs Directs},
    month = apr,
    year = {2015},
    howpublished = {Maison de la Simulation, Formation PATC, Alg\`ebre Lin\'eaire Creuse Parall\`ele, Paris, France} 
    }
    







Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.





Last modified: Tue Apr 4 11:58:35 2023
Author: ramet.


This document was translated from BibTeX by bibtex2html