This article describes the parallel version of a preconditioner for these systems, presented in its sequential form in [7]. It consists of three steps. First, a local decoupling of the pressure and saturation unknowns aims to concentrate the elliptic part of the system in the ``pressure block''. Second, this block is preconditioned by algebraic multigrid (AMG). Third, the equations are recoupled. Each step is efficiently parallelised using a partitioning of the domain into vertical layers along the $y$-axis and a distributed memory model within the PETSc library (Argonne National Laboratory, IL). The main new ingredient in the parallel version is a parallel AMG preconditioner for the pressure block, for which we use the BoomerAMG implementation in the Hypre library [4].
Numerical results on real case studies exhibit (i) a
significant reduction of CPU times, by up to a factor of 5 with respect
to a block Jacobi preconditioner with an ILU(0) factorisation of each
block, (ii) robustness with respect to heterogeneities,
anisotropies and high migration ratios, and (iii) a speedup of up
to 4 on 8 processors.
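
The three steps can be read schematically as follows. This is only an illustrative sketch: the block notation, the local scaling matrix $G$, and the block Gauss--Seidel form of the recoupling are assumptions made here for exposition and are not taken from [7]. Ordering the unknowns as pressure $p$ followed by saturation $s$, a system $Ax = b$ takes the form
\[
A = \left(\begin{array}{cc} A_{pp} & A_{ps} \\ A_{sp} & A_{ss} \end{array}\right),
\qquad
x = \left(\begin{array}{c} p \\ s \end{array}\right).
\]
Step 1 (decoupling) replaces $A$ by $\tilde{A} = G A$, where $G$ is assumed to act cell by cell, so that it can be applied without communication between subdomains, and is chosen so that $\tilde{A}_{pp}$ carries the elliptic part of the system. Step 2 approximates the action of $\tilde{A}_{pp}^{-1}$ by an AMG cycle, written $M_{pp}^{-1}$. Step 3 (recoupling), in the assumed block Gauss--Seidel form, applies the preconditioner to a residual $r = (r_p, r_s)^T$ by
\[
p = M_{pp}^{-1} r_p,
\qquad
s = M_{ss}^{-1} \left( r_s - \tilde{A}_{sp}\, p \right),
\]
where $M_{ss}^{-1}$ denotes an inexpensive approximation of $\tilde{A}_{ss}^{-1}$, for instance an incomplete factorisation of the local block.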