The PSBLAS library, developed to ease the parallelization of computationally intensive scientific applications, addresses the parallel implementation of iterative solvers for sparse linear systems in the distributed-memory paradigm. It includes routines for multiplying sparse matrices by dense matrices, solving block-diagonal systems with triangular diagonal blocks, and preprocessing sparse matrices, plus additional routines for dense matrix operations. The current implementation of PSBLAS targets a distributed-memory execution model based on message passing.
The PSBLAS library version 3 is implemented in the Fortran 2008 programming language, with reuse and/or adaptation of existing Fortran 77 and Fortran 95 software, plus a handful of C routines.
| Version 3.9.0-RC3 | July 23, 2025 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.9.0-rc3.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.9.0-rc3.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.9-rc3.pdf){:target="_blank"} |
| Version 3.8.1-2 | November 13, 2023 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.1-2.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.1-2.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.8.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.8.1 | September 29, 2023 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.1.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.1.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.8.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.8.0-2 | August 5, 2022 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.0-2.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.0-2.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.8.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.8.0 | May 24, 2022 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.0.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.8.0.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.8.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.7.0.2 | September 24, 2021 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.7.0.2.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.7.0.2.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.7.0.1.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.7.0.1 | May 11, 2021 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/V3.7.0.1.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/V3.7.0.1.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.7.0.1.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.7.0-1 | April 13, 2021 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/V3.7.0-1.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/V3.7.0-1.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.7.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
| Version 3.7.0 | April 12, 2021 |[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.7.0.zip)[{:height="24px" width="24px"}](https://github.com/sfilippone/psblas3/archive/refs/tags/v3.7.0.tar.gz)|[{:height="24px" width="24px"}](/psblasguide/psblas-3.7.pdf){:target="_blank"} [{:height="24px" width="24px"}](https://psctoolkit.github.io/psblasguide/index.html){:target="_blank"} |
Library releases can be downloaded from: [psblas3/releases](https://github.com/sfilippone/psblas3/releases)
## References
The architecture, philosophy and implementation details of the library are contained in the following papers:
- The architecture of the Fortran 2003 sparse BLAS is described in:
>S. Filippone, A. Buttari. Object-Oriented Techniques for Sparse Matrix Computations in Fortran 2003, ACM Trans. on Math. Software, vol. 38, No. 4, 2012.
- The software engineering ideas are further detailed in the paper:
>V. Cardellini, S. Filippone and D. Rouson. Design Patterns for sparse-matrix computations on hybrid CPU/GPU platforms, Scientific Programming, 22 (2014), pp. 1-19.
- The GPU support is explored in:
> S. Filippone, V. Cardellini, D. Barbieri and A. Fanfarillo. Sparse Matrix-Vector Multiplication on GPGPUs, ACM Transactions on Mathematical Software (TOMS), Volume 43, Issue 4, December 2016.
- Version 1.0 of the library is described in:
>S. Filippone, M. Colajanni. PSBLAS: A library for parallel linear algebra computation on sparse matrices, ACM Trans. on Math. Software, 26(4), Dec. 2000, pp. 527-550.
- The software infrastructure changes required to accommodate the implementation of the Additive-Schwarz preconditioners available in [AMG4PSBLAS](https://github.com/sfilippone/amg4psblas/) are detailed in:
> A. Buttari, P. D'Ambra, D. Di Serafino, S. Filippone. Extending PSBLAS to build parallel Schwarz preconditioners, Applied Parallel Computing. State of the Art in Scientific Computing: 7th International Workshop, PARA 2004, LNCS 3732, 2006, pp. 593-602.

> A. Buttari, P. D'Ambra, D. Di Serafino, S. Filippone. 2LEV-D2P4: A package of high-performance preconditioners for scientific and engineering applications, Applicable Algebra in Engineering, Communications and Computing, 2007, 18(3), pp. 223-239.

> P. D'Ambra, D. Di Serafino, S. Filippone. MLD2P4: A package of parallel algebraic multilevel domain decomposition preconditioners in Fortran 95, ACM Transactions on Mathematical Software, 2010, 37(3), 30.
PSBLAS is the backbone of the Parallel Sparse Computation Toolkit ([PSCToolkit](https://psctoolkit.github.io/)) suite of libraries.
We originally included a modified implementation of some of the Sparker (serial sparse BLAS) material; this has since been completely rewritten, well beyond the intentions and responsibilities of the original developers. The main reference for the serial sparse BLAS is:
>Duff, I., Marrone, M., Radicati, G., and Vittoli, C. Level 3 basic linear algebra subprograms for sparse matrices: a user level interface, ACM Trans. Math. Softw., 23(3), 379-401, 1997.
## Installing
To compile and run our software you will need the following prerequisites (see also SERIAL below):
1. A working version of MPI
2. A version of the BLAS; if you don't have a specific version for your
   platform you may try ATLAS, available from
   http://math-atlas.sourceforge.net/
3. We have had good results with the METIS library, from
   https://github.com/KarypisLab/METIS.
   This is optional; it is used in the util and test/fileread
   directories, but only if you specify `--with-metis`.
4. If you have the AMD package of Davis, Duff and Amestoy, you can
   specify `--with-amd` (see `./configure --help` for more details).
   We use the C interface to AMD.

5. If you have CUDA available, use
   - `--enable-cuda` to compile CUDA-enabled methods;
   - `--with-cudadir=<path>` to specify the CUDA toolkit location;
   - `--with-cudacc=XX,YY,ZZ` to specify a list of target CCs (compute
     capabilities) to compile the CUDA code for.
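Before running configure, it can help to verify that the main prerequisites are visible in your environment. A minimal sanity check might look like the following; the exact wrapper and launcher names (`mpifort` vs. `mpif90`, `mpirun` vs. `mpiexec`) depend on your MPI distribution:

```shell
# Look for the MPI compiler wrappers and launcher on the PATH;
# wrapper names vary between MPI distributions.
command -v mpifort || command -v mpif90
command -v mpicc
command -v mpirun  || command -v mpiexec

# If you intend to use --enable-cuda, check for the CUDA compiler too.
command -v nvcc && nvcc --version || echo "nvcc not found (CUDA support unavailable)"
```

If any of these commands come up empty, load the corresponding environment module or adjust your PATH before configuring.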
The configure script will generate a Make.inc file suitable for building
the library. The script can recognize the needed libraries
by their default names; if they are in unusual places, consider adding
the paths with `--with-libs`, or explicitly specifying the names in
`--with-blas`, etc.
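As an illustration only, a configure invocation on a typical Linux system might look like this; the install prefix, the BLAS library name (OpenBLAS here), and the extra library path are example values that must be adapted to your installation:

```shell
# Example configure invocation; adjust prefix, BLAS name, and paths.
./configure --prefix=/opt/psblas \
            --with-blas=-lopenblas \
            --with-metis \
            --with-libs="-L/usr/local/lib"
```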
>[!CAUTION]
> A common way for the configure script
> to fail is to specify inconsistent MPI vs. plain compilers, either
> directly or indirectly via environment variables; e.g. specifying the
> Intel compiler with `FC=ifort` while at the same time having an
> `MPIFC=mpif90` that points to GNU Fortran.
>[!TIP]
> The best way to avoid this
> situation is, in our opinion, to use the environment modules package
> (see [http://modules.sourceforge.net/](http://modules.sourceforge.net/)) and load the relevant
> variables with, e.g.,
> ```
> module load gcc/13.2.0 openmpi/4.1.6
> ```
> This delegates to the modules setup the task of ensuring that the version of
> OpenMPI in use is the one compiled with the matching GNU compilers. After the
> configure script has completed you can always tweak the Make.inc file
> yourself.
After you have fixed Make.inc, run
```
make
```
to compile the library; go to the test directory and its subdirectories
to build the test programs. If you specify `--prefix=/path`, you can run `make
install` and the libraries will be installed under `/path/lib`, while the
module files will be installed under `/path/modules`. The regular and
experimental C interface header files are placed under `/path/include`.
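Putting the steps above together, a typical build-and-install sequence might look like the following; `/opt/psblas` is just an example prefix:

```shell
# Example end-to-end build; adapt the prefix to your system.
./configure --prefix=/opt/psblas   # generates Make.inc
make                               # compiles the library
make install                       # libraries under /opt/psblas/lib,
                                   # module files under /opt/psblas/modules
```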
### Packaging changes, CUDA and GPU support
This version of PSBLAS incorporates into a single package three
components that were previously separate:

| Library | Description |
|---------|--------------------|
| PSBLAS | the base library |
| PSBLAS-EXT | a library providing additional storage formats for matrices and vectors |
| SPGPU | a package of kernels for NVIDIA GPUs, originally written by Davide Barbieri and Salvatore Filippone; see the license file [cuda/License-spgpu.md](cuda/License-spgpu.md) |
Moreover, the module and library previously called psb_krylov are now called
psb_linsolve, but their usage is otherwise unchanged.
### OpenACC
There is a highly experimental version of an OpenACC interface.