Commit 4db14c0e authored by Nathanaël Schaeffer

Update README.md and doc

parent c394dfb4
...@@ -64,16 +64,31 @@ DOCUMENTATION: ...@@ -64,16 +64,31 @@ DOCUMENTATION:
year = {2013}, year = {2013},
} }
If you use Ishioka's recurrence (the default since SHTns v3.4), you may also want to cite his paper:
@article {ishioka2018,
author={Ishioka, Keiichi},
title={A New Recurrence Formula for Efficient Computation of Spherical Harmonic Transform},
journal={Journal of the Meteorological Society of Japan},
doi = {10.2151/jmsj.2018-019}, volume={96}, number={2}, pages={241--249},
year={2018},
}
CHANGE LOG: CHANGE LOG:
----------- -----------
* v3.4 * v3.4 (10 Jun 2020)
- Change in API/ABI (`shtns.h`, `shtns.f03`): removal of lmidx array and new nlat_padded member - Change in API/ABI (`shtns.h`, `shtns.f03`): removal of `lmidx` array and new `nlat_padded` member
in shtns_cfg structure; function names unchanged. in `shtns_cfg` structure; function names and signatures remain unchanged.
- Ishioka's recurrence is now the default. - Ishioka's recurrence is now the default (faster).
- Improved performance, especially for small transforms (5 to 35% faster). - Further performance improvements, especially for small transforms (5 to 35% faster).
- Further performance improvements can be enabled with the new `SHT_ALLOW_PADDING` flag (1 to 50%), - Even more performance improvements can be enabled with the new `SHT_ALLOW_PADDING` flag (1 to 50%),
especially on KNL. especially on KNL.
- Regardless of the CC variable, gcc is now used by default for kernels (faster).
Use `--enable-kernel-compiler=` to override.
- Bugfixes in the shallow water examples, thanks to M. Schreiber.
- New FAQ in the docs.
* v3.3.1 (25 Sep 2019) * v3.3.1 (25 Sep 2019)
- Different name for openmp and non-openmp version of shtns library for KNL. - Different name for openmp and non-openmp version of shtns library for KNL.
......
...@@ -174,6 +174,10 @@ In this layout, increasing latitude are stored next to each other for each longi ...@@ -174,6 +174,10 @@ In this layout, increasing latitude are stored next to each other for each longi
That is \f$ A(\theta,\phi) \f$ = \c A[ip*NLAT + it] in C or \c A(it,ip) in Fortran. That is \f$ A(\theta,\phi) \f$ = \c A[ip*NLAT + it] in C or \c A(it,ip) in Fortran.
Use \ref SHT_THETA_CONTIGUOUS to instruct \ref shtns_init to use this spatial data layout. Use \ref SHT_THETA_CONTIGUOUS to instruct \ref shtns_init to use this spatial data layout.
Additionally, \ref SHT_ALLOW_PADDING instructs shtns to optimize the layout to avoid cache bank conflicts.
This can lead to a significant performance boost (from 1% to 50% depending on the architecture).
In that case, shtns_info#nlat_padded > shtns_info#nlat and shtns_info#nspat > shtns_info#nlat * shtns_info#nphi, reflecting the padded data layout.
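The effect of padding on indexing can be illustrated with a minimal sketch. It assumes that, in the theta-contiguous layout with padding, consecutive longitudes are `nlat_padded` elements apart, so that the unpadded formula `A[ip*NLAT + it]` becomes `A[ip*nlat_padded + it]`. The helpers `spat_index` and `sum_valid` are hypothetical and not part of SHTns; in real code the sizes would be read from the `shtns_cfg` structure rather than passed by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of SHTns): flat index of grid point (it, ip)
 * in a theta-contiguous array, ASSUMING that two consecutive longitudes are
 * nlat_padded elements apart in memory. */
static size_t spat_index(size_t it, size_t ip, size_t nlat, size_t nlat_padded)
{
    assert(it < nlat && nlat <= nlat_padded);
    return ip * nlat_padded + it;
}

/* Hypothetical helper: sum only the valid grid values, skipping the
 * (nlat_padded - nlat) padding entries at the end of each longitude line. */
static double sum_valid(const double *A, size_t nlat, size_t nphi,
                        size_t nlat_padded)
{
    double s = 0.0;
    for (size_t ip = 0; ip < nphi; ip++)
        for (size_t it = 0; it < nlat; it++)
            s += A[spat_index(it, ip, nlat, nlat_padded)];
    return s;
}
```

With `nlat = 3`, `nphi = 2` and `nlat_padded = 4`, the array needs at least 8 elements instead of 6, and entries 3 and 7 are padding that loops over the grid should never read as data.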
\section native Native layout \section native Native layout
The native way of storing spatial field data (which will help you achieve the best performance with SHTns) The native way of storing spatial field data (which will help you achieve the best performance with SHTns)
......
#!/bin/bash #!/bin/bash
# script to test many sht cases # script to test many sht cases
id=`hg id` #id=`git branch | sed -e '/^[^*]/d' -e 's/* //'`
id=`git rev-parse HEAD`
log="test_suite.log" log="test_suite.log"
function test1 { function test1 {
......