We were not 16-byte aligning cuDoubleComplex variables. The error showed up "much later", in cuGetVector (error 11) and cudaDeviceSynchronize (error 4), so we did a binary search to find its source. How do we program error checks so that run-time errors show up "immediately"? cuda_safe_call?
- pattern for error checking: issue different kernels in different streams, then call cudaStreamSynchronize and cudaGetLastError
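The error-checking pattern above could be sketched roughly as follows. This is only an illustration: the CUDA_SAFE_CALL macro, the scale kernel and run_two_streams are hypothetical names made up for the sketch (the notes only mention "cuda_safe_call"), not code from our tree.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro: abort with file/line as soon as a runtime call fails.
#define CUDA_SAFE_CALL(call)                                           \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s:%d: %s -> %s\n", __FILE__,        \
                         __LINE__, #call, cudaGetErrorString(err_));   \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Placeholder kernel: multiply every element of x by a.
__global__ void scale(double *x, double a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Launch one kernel per stream, then check each stream separately.
// Assumes n > 0. Returns a[0] + b[0] after scaling arrays of ones.
double run_two_streams(int n)
{
    size_t bytes = n * sizeof(double);
    double *h = (double *)std::malloc(bytes);
    for (int i = 0; i < n; ++i) h[i] = 1.0;

    double *a = NULL, *b = NULL;
    cudaStream_t s0, s1;
    CUDA_SAFE_CALL(cudaMalloc((void **)&a, bytes));
    CUDA_SAFE_CALL(cudaMalloc((void **)&b, bytes));
    CUDA_SAFE_CALL(cudaMemcpy(a, h, bytes, cudaMemcpyHostToDevice));
    CUDA_SAFE_CALL(cudaMemcpy(b, h, bytes, cudaMemcpyHostToDevice));
    CUDA_SAFE_CALL(cudaStreamCreate(&s0));
    CUDA_SAFE_CALL(cudaStreamCreate(&s1));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads, 0, s0>>>(a, 2.0, n);
    CUDA_SAFE_CALL(cudaGetLastError());         // catches bad launch configurations
    scale<<<blocks, threads, 0, s1>>>(b, 3.0, n);
    CUDA_SAFE_CALL(cudaGetLastError());

    CUDA_SAFE_CALL(cudaStreamSynchronize(s0));  // errors raised while s0 ran
    CUDA_SAFE_CALL(cudaStreamSynchronize(s1));  // errors raised while s1 ran
    CUDA_SAFE_CALL(cudaGetLastError());

    double ra = 0.0, rb = 0.0;
    CUDA_SAFE_CALL(cudaMemcpy(&ra, a, sizeof(double), cudaMemcpyDeviceToHost));
    CUDA_SAFE_CALL(cudaMemcpy(&rb, b, sizeof(double), cudaMemcpyDeviceToHost));

    CUDA_SAFE_CALL(cudaStreamDestroy(s0));
    CUDA_SAFE_CALL(cudaStreamDestroy(s1));
    CUDA_SAFE_CALL(cudaFree(a));
    CUDA_SAFE_CALL(cudaFree(b));
    std::free(h);
    return ra + rb;
}
```

Caveat: runtime calls may also return errors from earlier asynchronous launches, so the stream that reports a failure is only a hint about which kernel caused it. Synchronizing right after each launch (in debug builds only) narrows it down further, at the cost of losing concurrency.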
If one number is used by many threads, should it go into shared memory or constant memory? We would think constant memory is the right answer; we suspect shared memory would give a bank conflict.
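A minimal sketch of the constant-memory option, assuming the value is a single double set once from the host. The names d_factor, scale_by_constant and scale_on_gpu are hypothetical. For a lone scalar, simply passing it as a kernel argument is probably the simplest alternative.

```cuda
#include <cuda_runtime.h>

// Hypothetical name: a single value that every thread reads.
__constant__ double d_factor;

__global__ void scale_by_constant(double *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= d_factor;  // every thread reads the same address
}

// Copies factor into constant memory and scales h_x in place on the GPU.
// Assumes n > 0. Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t scale_on_gpu(double *h_x, int n, double factor)
{
    double *d_x = NULL;
    size_t bytes = n * sizeof(double);

    cudaError_t err = cudaMemcpyToSymbol(d_factor, &factor, sizeof(double));
    if (err != cudaSuccess) return err;

    err = cudaMalloc((void **)&d_x, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        scale_by_constant<<<blocks, threads>>>(d_x, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // Blocking copy: also waits for the kernel to finish.
        err = cudaMemcpy(h_x, d_x, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_x);
    return err;
}
```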
cuda-gdb generates output for kernel launches. Does this slow down the code dramatically? It becomes unusable.
- set flag "set kernel notification none"
- submit bug report if not solved
understand the crash in rpa-gpu-expt when running rpa_only_Na_cuda.py under nvprof
- should file a bug report
how does cuda deal with memory fragmentation?
nvvp error: "102 metrics have invalid values due to inconsistencies in the required event values"
double complex math: does it really compile to fp64 instructions?
talk to Gernot Ziegler about instruction limited kernels?
is our zherk kernel latency limited?
cufftPlanMany memory leak
trigger crash on NaN? how do NaNs get produced?
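As far as we know the GPU does not trap on floating-point exceptions, so a NaN (from 0/0, inf - inf, the square root of a negative number, or uninitialized memory) propagates silently. One possible workaround is to scan an array after a suspect step and stop on the host. The sketch below assumes a plain double array on the device; find_nan and has_nan are hypothetical names.

```cuda
#include <cuda_runtime.h>

// Set *found to 1 if any element of x is NaN.
__global__ void find_nan(const double *x, int n, int *found)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && isnan(x[i])) atomicExch(found, 1);
}

// d_x is a device pointer; assumes n > 0.
// Returns 1 if d_x contains a NaN, 0 if it does not, -1 on a CUDA error.
int has_nan(const double *d_x, int n)
{
    int *d_found = NULL;
    int h_found = 0;

    if (cudaMalloc((void **)&d_found, sizeof(int)) != cudaSuccess) return -1;

    cudaError_t err = cudaMemset(d_found, 0, sizeof(int));
    if (err == cudaSuccess) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        find_nan<<<blocks, threads>>>(d_x, n, d_found);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // Blocking copy: also waits for the kernel to finish.
        err = cudaMemcpy(&h_found, d_found, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_found);
    return (err == cudaSuccess) ? h_found : -1;
}
```

Calling has_nan after each major step and aborting when it returns 1 gives a crude "crash on NaN" for debugging runs; it adds a kernel launch and a synchronizing copy per check, so it should be left out of production runs.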
is cuda-memcheck the same as valgrind?
get many errors from cublas with racecheck
- if really errors: submit bug report
what memory access errors can memcheck detect? cudaMemcpy? array out-of-bounds?
- doesn't detect cudaMemcpy errors (or any errors on the host) but does detect out-of-bounds array accesses on the GPU

3/12/2013

look at profiling on RPA (lin)
ask about error handling at GTC (lin)
base.py get_phi_agp kernel
rpa manuscript (jun)
k-point parallelization (cpo)
