87 commits
cd1ffc9
First commit for the GmresMixed class:
Apr 27, 2020
d7ba04c
Inclusion of the omp executor in the repository:
Apr 27, 2020
969358f
Inclusion of the cuda executor in the repository:
Apr 29, 2020
2b7122f
The test files are finally included, but:
Apr 29, 2020
6dc6470
The first unoptimized version is done. Next tasks:
Apr 30, 2020
4d4fab1
The default value for ValueTypeKrylovBasis has been changed to avoid …
josealiaga Apr 30, 2020
e0d39a6
Definition of the CG2 variant of the finish_arnoldi routine for omp a…
josealiaga May 3, 2020
27019da
Add GMRES_mixed to benchmark
May 7, 2020
f2b5a3a
Definition of the CGS2 version of finish_arnoldi method, for omp and …
josealiaga May 7, 2020
bfd4196
Finally a good implementation of the multidot_kernels_num_iters_1 is …
josealiaga May 12, 2020
4ad491e
Add Accessor support and extend reference test
May 13, 2020
fbcc4f2
Made GmresMixed compile with complex types
May 13, 2020
911fd5c
The update routines have been improved. Now the computational time is…
josealiaga May 19, 2020
db081ac
Add specialization for integer types for Accessor
May 19, 2020
7db4796
Make the scale work with integer types
May 20, 2020
9bbcf87
Add helper to determine if we need a scale or not
May 20, 2020
97689ab
Add a helper structure to manage the scale writing in common
May 20, 2020
c0aa2ed
Testing the push command
josealiaga May 20, 2020
da8a56e
Definition of norm2 and norminf routines in CUDA. Only the first one …
josealiaga May 22, 2020
23efe7a
remove_complex has been added to the norms variables, and multinorm2_…
josealiaga May 24, 2020
f3cdbd7
Fixed cuda step2 to take a view into account
May 25, 2020
d4865f2
Change const accessor to non-const in check_arnoldi_norms_new
May 25, 2020
42a779b
The set_scale method finally works!!
josealiaga May 27, 2020
6b25d2e
Change storage layout of krylov_bases
Jun 1, 2020
e3ca7cc
Make memory access to krylov_bases coalesced again
Jun 2, 2020
911fccb
Transpose grid when launching singledot kernel
Jun 4, 2020
5dd8bf2
Reversed the transpose of the grid dim
Jun 4, 2020
b214a5f
Add half precision support to GmresMixed
Jun 15, 2020
f1f178e
Hopefully improve singledot performance
Jun 15, 2020
fb64f5c
Infinity norm only computed when scale is present
Jun 17, 2020
155c552
Add another RHS generation in the benchmark
Jun 23, 2020
d660c47
Fix residual_norm calculation in GmresMixed
Jun 23, 2020
8ede6d6
Make sure GmresMixed does not exit early
Jun 25, 2020
af6fb48
Add benchmark parameter for GMRES krylov_dim
Jun 26, 2020
49f1763
Add forced iterations when convergence is detected
Jun 26, 2020
188ff18
Add debug output to forced iterations
Jun 26, 2020
297004b
Fix reference bug in GmresMixed
Jul 8, 2020
b7765c3
DEBUG: Add write output for integral accessor
Jul 30, 2020
f31e87d
DEBUG: Move towards `at` with accessor
Aug 14, 2020
6f82f47
Remove Accessor3dConst
Aug 17, 2020
c385a2b
Adopt OpenMP support to new Accessor
Aug 17, 2020
9a0a0f0
Remove unused GMRES_mixed code from Ref & OMP
Aug 17, 2020
3d44a21
Adopt CUDA to the new accessor format (NOT `at`)
Aug 17, 2020
e1654b4
Make HIP and CUDA work with new accessor (NOT at)
Aug 17, 2020
5d09f2a
Remove unused code from CUDA
Aug 17, 2020
48e0899
CUDA implementation is now using `at`
Aug 18, 2020
b542646
Re-add ConstAccessor
Aug 18, 2020
b0a2ac3
Fix accessor by adding additional __restrict__
Aug 19, 2020
5d1f173
GmresMixed storage prec is now a factory parameter
Aug 25, 2020
267bbf1
Improve reference test and include the enum there
Aug 25, 2020
6b92a4f
Fix the reference test to pass
Aug 26, 2020
1d4a773
Adopt to new parameter macros
Sep 1, 2020
30381e2
Update the helper to throw when complex
Sep 7, 2020
6d30e36
Make GmresMixed work properly with multiple RHS
Sep 9, 2020
7f86d9c
Fix benchmark to work with new GmresMixed layout
Sep 10, 2020
b2b9ebc
Use new reduced_row_major Accessor in GmresMixed
Sep 11, 2020
db3046f
Remove unnecessary code from CUDA GmresMixed
Sep 14, 2020
da87d06
Add HIP kernels
Sep 14, 2020
65a8a8b
Fix GmresMixed core problem
Sep 17, 2020
a9646e8
Improve force-reset behavior
Dec 12, 2020
c6020db
Rename GmresMixed to CbGmres
Jan 22, 2021
36a1a19
Format files
ginkgo-bot Jan 23, 2021
c4a7335
Add DPCPP stubs to allow compilation
Jan 24, 2021
fa1fc39
Make cb-gmres benchmarks dependent on etype
Jan 28, 2021
c509263
Fix implementation and reference test for CB-GMRES
Feb 1, 2021
6bb09da
Update tolerance for one reference CB-GMRES test
Feb 1, 2021
3655021
Update atomic_max
Feb 2, 2021
5270478
Remove unnecessary kernels and properly name them
Feb 2, 2021
2f0a32e
Review update
Feb 2, 2021
58e1104
Add Helper INSTANTIATE macro for CB-GMRES
Feb 3, 2021
c47a4ca
Remove CB-GMRES and GMRES example
Feb 3, 2021
2b587ba
Review update
Feb 3, 2021
da8a6b4
Remove unnecessary includes of iostream and time.h
Feb 3, 2021
c4c1270
Remove circular dependency of compute_norm2 in (CB)-GMRES
Feb 3, 2021
00d33b5
Update solver generation in benchmark
Feb 3, 2021
b163893
Update eta and arnoldi_norms in CB-GMRES
Feb 4, 2021
342957a
Remove CUDA 9.0 exception for constexpr parameter
Feb 4, 2021
370d208
Review Update
Feb 5, 2021
1b2071d
Sonarcloud update
Feb 6, 2021
b4a6fc9
Review update; Improve run_all_benchmarks.sh
Feb 8, 2021
599e261
Put storage_precision enum into cb_gmres namespace
Feb 9, 2021
5126858
Add CB-GMRES example
Feb 9, 2021
169040d
Remove unnecessary included files for CB-GMRES
Feb 9, 2021
05374fd
Review update
Feb 15, 2021
389d038
Review update
Feb 16, 2021
6d6bbab
Update contributors.txt
josealiaga Feb 19, 2021
4d722f1
Update contributors.txt
josealiaga Feb 19, 2021
Add another RHS generation in the benchmark
- The benchmark script now has the option to change the initial guess
- Add option to generate the RHS with:
  b = A * (s / |s|) with s(i) = sin(i)
Thomas Grützmacher committed Feb 19, 2021
commit 155c552411557df5f453cb7427bc93aabedfa2ae
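To make the formula in the commit message concrete, here is a minimal standalone sketch of b = A * (s / |s|) with s(i) = sin(i). It assumes |s| is the Euclidean norm and uses a small dense row-major matrix with plain `std::vector` instead of Ginkgo types. The name `sinus_rhs` is hypothetical and is not part of this commit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Ginkgo): computes b = A * (s / |s|),
// where s(i) = sin(i) and |s| is assumed to be the Euclidean norm of s.
// A is a dense n x n matrix stored in row-major order.
std::vector<double> sinus_rhs(const std::vector<double>& a, std::size_t n)
{
    std::vector<double> s(n);
    double norm_sq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = std::sin(static_cast<double>(i));
        norm_sq += s[i] * s[i];
    }
    // For n == 1 the vector is s = [sin(0)] = [0], so the norm is zero and
    // the normalization is undefined; this sketch assumes n >= 2.
    const double inv_norm = 1.0 / std::sqrt(norm_sq);
    for (auto& v : s) {
        v *= inv_norm;
    }
    // Dense matrix-vector product b = A * s.
    std::vector<double> b(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            b[i] += a[i * n + j] * s[j];
        }
    }
    return b;
}
```

One consequence of building the right-hand side this way is that, for a nonsingular A, the exact solution of A x = b is known in advance: it is the unit-norm vector s / |s|.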
13 changes: 10 additions & 3 deletions BENCHMARKING.md
@@ -143,7 +143,7 @@ The benchmark suite can take a number of configuration parameters. Benchmarks
can be run only for `sparse matrix vector products (spmv)`, for full solvers
(with or without preconditioners), or for preconditioners only when supported.
The benchmark suite also allows to target a sub-part of the SuiteSparse matrix
collection. For details, see the [available benchmark options](### 5: Available
collection. For details, see the [available benchmark options](### 6: Available
benchmark options). Here are the most important options:
* `BENCHMARK={spmv, solver, preconditioner}` - allows to select the type of
benchmark to be run.
@@ -303,9 +303,16 @@ The supported environment variables are described in the following list:
the solver should stop. The default is `1e-6`.
* `SOLVERS_MAX_ITERATIONS=<number>` - the maximum number of iterations with which
a solver should be run. The default is `10000`.
* `SOLVERS_RHS={unit, random}` - whether to use a vector of all ones or random
values as the right-hand side in solver benchmarks. Default is `unit`.
* `SOLVERS_RHS={1, random, sinus}` - whether to use a vector of all ones,
random values, or $b = A * (s / |s|)$ with $s(i) = sin(i)$
as the right-hand side in solver benchmarks. Default is `1`.
* `SOLVERS_INITIAL_GUESS={rhs, 0, random}` - the initial guess generation of the
solvers. `rhs` uses the right-hand side, `0` uses a zero vector and `random`
generates a random vector as the initial guess. Default is `rhs`.
* `DETAILED={0,1}` - selects whether detailed benchmarks should be run for the
solver benchmarks, can be either `0` (off) or `1` (on). The default is `0`.
* `GPU_TIMER={true, false}` - If set to `true`, use the gpu timer, which is
valid for cuda/hip executor, to measure the timing. Default is `false`.
* `SOLVERS_JACOBI_MAX_BS` - sets the maximum block size for the Jacobi
preconditioner (if used, otherwise, it does nothing) in the solvers
benchmark
41 changes: 36 additions & 5 deletions benchmark/run_all_benchmarks.sh
@@ -60,9 +60,15 @@ if [ ! "${DEVICE_ID}" ]; then
DEVICE_ID="0"
fi

if [ ! "${SOLVERS_JACOBI_MAX_BS}" ]; then
SOLVERS_JACOBI_MAX_BS="32"
echo "SOLVERS_JACOBI_MAX_BS environment variable not set - assuming \"${SOLVERS_JACOBI_MAX_BS}\"" 1>&2
fi


if [ ! "${SOLVERS_RHS}" ]; then
echo "SOLVERS_RHS environment variable not set - assuming \"unit\"" 1>&2
SOLVERS_RHS="unit"
SOLVERS_RHS="1"
echo "SOLVERS_RHS environment variable not set - assuming \"${SOLVERS_RHS}\"" 1>&2
fi

if [ ! "${BENCHMARK_PRECISION}" ]; then
@@ -85,16 +91,39 @@ else
fi

if [ "${SOLVERS_RHS}" == "random" ]; then
SOLVERS_RHS_FLAG="--random_rhs=true"
SOLVERS_RHS_FLAG="--rhs_generation=random"
elif [ "${SOLVERS_RHS}" == "1" ]; then
SOLVERS_RHS_FLAG="--rhs_generation=1"
elif [ "${SOLVERS_RHS}" == "sinus" ]; then
SOLVERS_RHS_FLAG="--rhs_generation=sinus"
else
SOLVERS_RHS_FLAG="--random_rhs=false"
echo "SOLVERS_RHS does not support the value \"${SOLVERS_RHS}\"." 1>&2
echo "The following values are supported: \"1\", \"random\" and \"sinus\"" 1>&2
exit 1
fi

if [ ! "${SOLVERS_INITIAL_GUESS}" ]; then
SOLVERS_INITIAL_GUESS="rhs"
echo "SOLVERS_INITIAL_GUESS environment variable not set - assuming \"${SOLVERS_INITIAL_GUESS}\"" 1>&2
fi

if [ ! "${GPU_TIMER}" ]; then
echo "GPU_TIMER environment variable not set - assuming \"false\"" 1>&2
GPU_TIMER="false"
fi

if [ "${SOLVERS_INITIAL_GUESS}" == "random" ]; then
SOLVERS_INITIAL_GUESS_FLAG="--initial_guess_generation=random"
elif [ "${SOLVERS_INITIAL_GUESS}" == "0" ]; then
SOLVERS_INITIAL_GUESS_FLAG="--initial_guess_generation=0"
elif [ "${SOLVERS_INITIAL_GUESS}" == "rhs" ]; then
SOLVERS_INITIAL_GUESS_FLAG="--initial_guess_generation=rhs"
else
echo "SOLVERS_INITIAL_GUESS does not support the value \"${SOLVERS_INITIAL_GUESS}\"." 1>&2
echo "The following values are supported: \"rhs\", \"0\" and \"random\"" 1>&2
exit 1
fi

# Control whether to run detailed benchmarks or not.
# Default setting is detailed=false. To activate, set DETAILED=1.
if [ ! "${DETAILED}" ] || [ "${DETAILED}" -eq 0 ]; then
@@ -202,7 +231,9 @@ run_solver_benchmarks() {
--executor="${EXECUTOR}" --solvers="${SOLVERS}" \
--preconditioners="${PRECONDS}" \
--max_iters=${SOLVERS_MAX_ITERATIONS} --rel_res_goal=${SOLVERS_PRECISION} \
${SOLVERS_RHS_FLAG} ${DETAILED_STR} --device_id="${DEVICE_ID}" --gpu_timer=${GPU_TIMER} \
${SOLVERS_RHS_FLAG} ${DETAILED_STR} ${SOLVERS_INITIAL_GUESS_FLAG} \
--gpu_timer=${GPU_TIMER} \
--jacobi_max_block_size=${SOLVERS_JACOBI_MAX_BS} --device_id="${DEVICE_ID}" \
<"$1.imd" 2>&1 >"$1"
keep_latest "$1" "$1.bkp" "$1.bkp2" "$1.imd"
}
89 changes: 72 additions & 17 deletions benchmark/solver/solver.cpp
@@ -89,11 +89,19 @@ DEFINE_double(
idr_kappa, 0.7,
"the number to check whether Av_n and v_n are too close or not in IDR");

DEFINE_bool(random_rhs, false,
"Use a random vector for the rhs (otherwise use all ones)");

DEFINE_bool(random_initial_guess, false,
"Use a random vector for the initial guess (otherwise use rhs)");
DEFINE_string(
rhs_generation, "1",
"Method used to generate the right hand side. Supported values are: "
"1, random, sinus. 1 sets all values of the right hand side to 1, "
"random assigns the values to a uniformly distributed random number "
"in [-1, 1), and sinus assigns b = A * (s / |s|) with A := system matrix, "
"s := vector with s(i) = sin(i).");

DEFINE_string(
initial_guess_generation, "rhs",
"Method used to generate the initial guess. Supported values are: "
"random, rhs, 0. random uses a random vector, rhs uses the right "
"hand side, and 0 uses a zero vector as the initial guess.");

// This allows to benchmark the overhead of a solver by using the following
// data: A=[1.0], x=[0.0], b=[nan]. This data can be used to benchmark normal
@@ -118,6 +126,62 @@ DEFINE_bool(overhead, false,
}


template <typename Engine>
std::unique_ptr<vec<etype>> generate_rhs(
std::shared_ptr<const gko::Executor> exec,
std::shared_ptr<const gko::LinOp> system_matrix, Engine engine)
{
using rc_etype = gko::remove_complex<etype>;
gko::dim<2> vec_size{system_matrix->get_size()[0], FLAGS_nrhs};
if (FLAGS_rhs_generation == "1") {
return create_matrix<etype>(exec, vec_size, gko::one<etype>());
} else if (FLAGS_rhs_generation == "random") {
return create_matrix<etype>(exec, vec_size, engine);
} else if (FLAGS_rhs_generation == "sinus") {
auto rhs = vec<etype>::create(exec, vec_size);

auto tmp = create_matrix_sin<etype>(exec, vec_size);
auto scalar = gko::matrix::Dense<rc_etype>::create(
exec->get_master(), gko::dim<2>{1, vec_size[1]});
tmp->compute_norm2(scalar.get());
for (gko::size_type i = 0; i < vec_size[1]; ++i) {
scalar->at(0, i) = gko::one<rc_etype>() / scalar->at(0, i);
}
// normalize sin-vector
if (gko::is_complex_s<etype>::value) {
tmp->scale(scalar->make_complex().get());
} else {
tmp->scale(scalar.get());
}
system_matrix->apply(tmp.get(), rhs.get());
return rhs;
}
throw std::invalid_argument(std::string("\"rhs_generation\" = ") +
FLAGS_rhs_generation + " is not supported!");
}


template <typename Engine>
std::unique_ptr<vec<etype>> generate_initial_guess(
std::shared_ptr<const gko::Executor> exec,
std::shared_ptr<const gko::LinOp> system_matrix, const vec<etype> *rhs,
Engine engine)
{
using rc_etype = gko::remove_complex<etype>;
gko::dim<2> vec_size{system_matrix->get_size()[1], FLAGS_nrhs};
if (FLAGS_initial_guess_generation == "0") {
return create_matrix<etype>(exec, vec_size, gko::zero<etype>());
} else if (FLAGS_initial_guess_generation == "random") {
return create_matrix<etype>(exec, vec_size, engine);
} else if (FLAGS_initial_guess_generation == "rhs") {
return rhs->clone();
}
throw std::invalid_argument(std::string("\"initial_guess_generation\" = ") +
FLAGS_initial_guess_generation +
" is not supported!");
}


void validate_option_object(const rapidjson::Value &value)
{
if (!value.IsObject() || !value.HasMember("optimal") ||
@@ -552,19 +616,10 @@ int main(int argc, char *argv[])
std::ifstream rhs_fd{test_case["rhs"].GetString()};
b = gko::read<Vec>(rhs_fd, exec);
} else {
b = create_matrix<etype>(
exec,
gko::dim<2>{system_matrix->get_size()[0], FLAGS_nrhs},
engine, FLAGS_random_rhs);
}
if (FLAGS_random_initial_guess) {
x = create_matrix<etype>(
exec,
gko::dim<2>{system_matrix->get_size()[0], FLAGS_nrhs},
engine);
} else {
x = b->clone();
b = generate_rhs(exec, system_matrix, engine);
}
x = generate_initial_guess(exec, system_matrix, b.get(),
engine);
}

std::clog << "Matrix is of size (" << system_matrix->get_size()[0]
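The string-based dispatch in `generate_rhs` and `generate_initial_guess` compares the flag value on every call and throws `std::invalid_argument` for unknown values. As an illustration only, the same validation could be done once by mapping the string to an enum. The names `guess_kind` and `parse_guess_kind` below are hypothetical and do not appear in this commit.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical enum mirroring the three supported values of
// --initial_guess_generation ("rhs", "0", "random").
enum class guess_kind { rhs, zero, random };

// Hypothetical helper: maps the flag string to an enum value and throws
// std::invalid_argument for anything else, with the same message format
// as the functions in the diff.
guess_kind parse_guess_kind(const std::string& value)
{
    if (value == "rhs") {
        return guess_kind::rhs;
    } else if (value == "0") {
        return guess_kind::zero;
    } else if (value == "random") {
        return guess_kind::random;
    }
    throw std::invalid_argument("\"initial_guess_generation\" = " + value +
                                " is not supported!");
}
```

With such a helper, an unsupported value would be reported before any matrix is read, and the generation code could `switch` on the enum instead of comparing strings.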
68 changes: 52 additions & 16 deletions benchmark/utils/general.hpp
@@ -287,22 +287,51 @@ template <typename ValueType>
using vec = gko::matrix::Dense<ValueType>;


// creates a zero vector
// Create a matrix with values s[i, j] = sin(i)
template <typename ValueType>
std::unique_ptr<vec<ValueType>> create_vector(
std::shared_ptr<const gko::Executor> exec, gko::size_type size)
std::enable_if_t<!gko::is_complex_s<ValueType>::value,
std::unique_ptr<vec<ValueType>>>
create_matrix_sin(std::shared_ptr<const gko::Executor> exec, gko::dim<2> size)
{
auto h_res = vec<ValueType>::create(exec->get_master(), size);
for (gko::size_type i = 0; i < size[0]; ++i) {
for (gko::size_type j = 0; j < size[1]; ++j) {
h_res->at(i, j) = std::sin(static_cast<ValueType>(i));
}
}
auto res = vec<ValueType>::create(exec);
res->read(gko::matrix_data<ValueType>(gko::dim<2>{size, 1}));
h_res->move_to(res.get());
return res;
}

// Note: complex values are assigned s[i, j] = {sin(2 * i), sin(2 * i + 1)}
template <typename ValueType>
std::enable_if_t<gko::is_complex_s<ValueType>::value,
std::unique_ptr<vec<ValueType>>>
create_matrix_sin(std::shared_ptr<const gko::Executor> exec, gko::dim<2> size)
{
auto h_res = vec<ValueType>::create(exec->get_master(), size);
for (gko::size_type i = 0; i < size[0]; ++i) {
for (gko::size_type j = 0; j < size[1]; ++j) {
h_res->at(i, j) = ValueType{
std::sin(static_cast<gko::remove_complex<ValueType>>(2 * i)),
std::sin(
static_cast<gko::remove_complex<ValueType>>(2 * i + 1))};
}
}
auto res = vec<ValueType>::create(exec);
h_res->move_to(res.get());
return res;
}


template <typename ValueType>
std::unique_ptr<vec<ValueType>> create_matrix(
std::shared_ptr<const gko::Executor> exec, gko::dim<2> size)
std::shared_ptr<const gko::Executor> exec, gko::dim<2> size,
ValueType value)
{
auto res = vec<ValueType>::create(exec);
res->read(gko::matrix_data<ValueType>(size));
res->read(gko::matrix_data<ValueType>(size, value));
return res;
}

@@ -311,18 +340,25 @@ std::unique_ptr<vec<ValueType>> create_matrix(
template <typename ValueType, typename RandomEngine>
std::unique_ptr<vec<ValueType>> create_matrix(
std::shared_ptr<const gko::Executor> exec, gko::dim<2> size,
RandomEngine &engine, bool random = true)
RandomEngine &engine)
{
auto res = vec<ValueType>::create(exec);
if (random) {
res->read(gko::matrix_data<ValueType>(
size,
std::uniform_real_distribution<gko::remove_complex<ValueType>>(-1.0,
1.0),
engine));
} else {
res->read(gko::matrix_data<ValueType>(size, gko::one<ValueType>()));
}
res->read(gko::matrix_data<ValueType>(
size,
std::uniform_real_distribution<gko::remove_complex<ValueType>>(-1.0,
1.0),
engine));
return res;
}


// creates a zero vector
template <typename ValueType>
std::unique_ptr<vec<ValueType>> create_vector(
std::shared_ptr<const gko::Executor> exec, gko::size_type size)
{
auto res = vec<ValueType>::create(exec);
res->read(gko::matrix_data<ValueType>(gko::dim<2>{size, 1}));
return res;
}
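
The two `create_matrix_sin` overloads in this file are selected with `std::enable_if_t` depending on whether the value type is complex. The following standalone sketch shows the same selection pattern for a single column, using `std::vector` and `std::complex` in place of Ginkgo types. The trait `is_complex` is a hypothetical stand-in for `gko::is_complex_s`, and `sin_vector` is a hypothetical name.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for gko::is_complex_s.
template <typename T>
struct is_complex : std::false_type {};
template <typename T>
struct is_complex<std::complex<T>> : std::true_type {};

// Real overload: s(i) = sin(i).
template <typename T>
std::enable_if_t<!is_complex<T>::value, std::vector<T>> sin_vector(
    std::size_t n)
{
    std::vector<T> s(n);
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = std::sin(static_cast<T>(i));
    }
    return s;
}

// Complex overload: s(i) = {sin(2 * i), sin(2 * i + 1)}.
template <typename T>
std::enable_if_t<is_complex<T>::value, std::vector<T>> sin_vector(
    std::size_t n)
{
    using real = typename T::value_type;
    std::vector<T> s(n);
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = T{std::sin(static_cast<real>(2 * i)),
                 std::sin(static_cast<real>(2 * i + 1))};
    }
    return s;
}
```

Exactly one overload takes part in overload resolution for a given `T`, so a call such as `sin_vector<std::complex<double>>(n)` is unambiguous even though both templates share a name and parameter list.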
