NIfTI Message Board

 Re: Size limitations in NIfTI
Author: Cinly Ooi (---.cmbg.cable.ntl.com)
Date:   12-07-07 18:00

Dear Ged,

Sorry for the late reply; it appears that I missed the notification of your post.

My short reply is simply: use whichever data-management style works best for you. I would prefer reading fMRI data as a single 4D file, for example. Let's deal with speed problems and the like once profiling proves that the read/write routines are the bottleneck.

For real-life fMRI data, a 4D dataset is usually of a manageable size to keep as one file.
Over here, our fMRI data per scan is a few hundred MB. But if you combine, say, 12 subjects into one 5D file, you are likely to hit the 2 GB limit described by Rick, and the operating system will stop you.

As for the speed penalty: if you do not access the data serially, then you will likely use fseek() to jump to the location of choice. If you use non-gzipped NIfTI, the speed penalty will depend on how fast fseek() can reach a particular location. The C spec does not say anything about the complexity of this function, i.e., there is no guaranteed upper bound on the time required. It could be O(1), O(n), or O(log n), where n is the number of positions to move forward/backward. I don't think it is O(1), i.e., the same cost no matter how far you jump; I also think it is not as bad as O(n), i.e., time proportional to the jump distance; it is more likely O(log n), i.e., the time grows with jump size, but not as badly as O(n).
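For what it's worth, here is a minimal C sketch of the kind of random access I have in mind: compute a linear voxel offset and fseek() straight to it in an uncompressed .nii. The file name, dimensions, and the assumption of float32 data starting at the usual vox_offset of 352 are all made up for illustration.

#include <stdio.h>

int main(void)
{
    /* all values below are made-up examples, not read from a real header */
    const long nx = 64, ny = 64, nz = 30;   /* assumed volume dimensions */
    const long vox_offset = 352;            /* usual data start in a .nii */
    long x = 10, y = 20, z = 5, t = 7;      /* voxel of interest */
    float value;

    FILE *fp = fopen("data.nii", "rb");
    if (fp == NULL) return 1;

    /* linear index: x + y*nx + z*nx*ny + t*nx*ny*nz (float32 data assumed) */
    long index = x + y*nx + z*nx*ny + t*nx*ny*nz;

    /* note: plain fseek() takes a long offset, which is itself part of the
       2 GB problem on 32-bit systems; fseeko()/off_t would be needed for
       larger files */
    if (fseek(fp, vox_offset + index * (long)sizeof(float), SEEK_SET) != 0) {
        fclose(fp);
        return 1;
    }
    if (fread(&value, sizeof(float), 1, fp) != 1) {
        fclose(fp);
        return 1;
    }

    printf("voxel (%ld,%ld,%ld,%ld) = %g\n", x, y, z, t, value);
    fclose(fp);
    return 0;
}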

If you are reading compressed NIfTI, there is more uncertainty about the speed of jumping to a particular location. Again, gzseek() does not guarantee an upper bound on seek time. The time required for a jump is harder to predict, because it might require decompressing the file up to that point (slow), or gzip might have a way of positioning the file pointer via a lookup table (fast). Again, I have no idea what the complexity is like.
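Purely as an illustrative sketch (same made-up file name and dimensions as above), the zlib calls would look like this. As far as I know, zlib emulates gzseek() on a file opened for reading by decompressing forward to the target (rewinding to the start first for backward seeks), so there is no lookup-table shortcut and the call can be slow.

#include <stdio.h>
#include <zlib.h>

int main(void)
{
    const long nx = 64, ny = 64, nz = 30;   /* same assumed dimensions */
    const long vox_offset = 352;
    long x = 10, y = 20, z = 5, t = 7;
    float value;

    gzFile gz = gzopen("data.nii.gz", "rb");
    if (gz == NULL) return 1;

    long index = x + y*nx + z*nx*ny + t*nx*ny*nz;

    /* for read-only files zlib emulates gzseek() by decompressing
       forward to the requested position, so this can be very slow */
    if (gzseek(gz, vox_offset + index * (long)sizeof(float), SEEK_SET) < 0) {
        gzclose(gz);
        return 1;
    }
    if (gzread(gz, &value, sizeof(float)) != (int)sizeof(float)) {
        gzclose(gz);
        return 1;
    }

    printf("voxel (%ld,%ld,%ld,%ld) = %g\n", x, y, z, t, value);
    gzclose(gz);
    return 0;
}
/* compile with: cc example.c -lz */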

> I guess in practice though, you would want to read batches of voxels from disk at a time, and only loop over them in memory, which perhaps makes things more complicated than my intuition above... Any further comments?

You have to loop over voxels in memory; you cannot do it on disk. ;-)

Now, there are two possible implementations:
data4D[N] = /* an array of N elements, each pointing to a contiguous chunk of 4D data */
continuous5D = /* a single contiguous 1D array holding the whole 5D dataset */

To access the element at location data5D(m,l,k,j,i), let
4Ddata_offset = i + j*width + k*width*height + l*width*height*vol

data4D[m][4Ddata_offset]: we have to follow two offsets to reach the position of interest, one to select element m (a pointer dereference), and another to skip ahead by 4Ddata_offset.

continuous5D[m * width*height*vol*series + 4Ddata_offset]: your computer first has to compute (m * width*height*vol*series), then add 4Ddata_offset to it, and finally do one offset of size (m * width*height*vol*series + 4Ddata_offset). That arithmetic is computationally more expensive than the extra indirection in data4D.

So I suppose I would prefer data4D. It is easier to find N contiguous chunks of memory for the 4D datasets than one single chunk for the whole 5D dataset, and access to an individual 4D dataset is also faster. A concrete sketch of the two layouts follows below.
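Here is a small C sketch of the two layouts, using the dimension names from above; the concrete sizes are toy values, and error checks on the allocations are omitted for brevity.

#include <stdlib.h>

#define WIDTH  16
#define HEIGHT 16
#define VOL     8    /* slices per volume */
#define SERIES 10    /* volumes per 4D time series */
#define N       3    /* number of 4D datasets, e.g. subjects */

int main(void)
{
    long len4d = (long)WIDTH * HEIGHT * VOL * SERIES;

    /* layout 1: N separately allocated, contiguous 4D chunks */
    float *data4D[N];
    for (int m = 0; m < N; m++)
        data4D[m] = calloc(len4d, sizeof(float));

    /* layout 2: one contiguous chunk holding the whole 5D dataset */
    float *continuous5D = calloc(N * len4d, sizeof(float));

    /* example access to element (m,l,k,j,i) = (2,4,3,2,1) */
    long i = 1, j = 2, k = 3, l = 4, m = 2;
    long off = i + j*WIDTH + k*(long)(WIDTH*HEIGHT)
                 + l*(long)(WIDTH*HEIGHT*VOL);        /* 4Ddata_offset */

    float a = data4D[m][off];                /* two steps: dereference, then offset */
    float b = continuous5D[m*len4d + off];   /* one large computed offset */

    (void)a; (void)b;
    for (int m2 = 0; m2 < N; m2++)
        free(data4D[m2]);
    free(continuous5D);
    return 0;
}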

HTH
> Hi Cinly,
>
> Many thanks for your detailed reply! I'm particularly
> interested in this part:
>
> > Given speed penalty, programming penalty (difficult to
> > find enough continuous memory to hold full 5D dataset, long
> > time to access particular 4D dataset coz you have a large
> > offset value), [...] likely to be better off storing data as
> > N files of (N-1)-D data.
>
> One of my motivations for wanting the data in a single NIfTI
> was that I thought, obviously wrongly, that it would lead to
> quicker access. I assumed this is why (to take one example) the
> randomise program in FSL wants a 4D NIfTI rather than a bunch
> of 3D ones. I guess it depends a lot on how the algorithms are
> coded up, but I would intuitively expect that if you are
> looping over voxels and processing n-vectors over the 4th
> dimension (e.g. as a simple implementation of a voxel-wise GLM
> with n scans might do) then it is quicker to read an n-vector
> as a block down the whole of the 4th dimension of a single
> nifti than to read the data from n separate 3D NIfTIs. Likewise
> (possibly even more so, I assumed) if you want to read n-scans
> by m (multivariate data) matrices at each voxel, it would seem
> that contiguous n*m blocks read from the 5D NIfTI would be
> quicker...
>
> I guess in practice though, you would want to read batches of
> voxels from disk at a time, and only loop over them in memory,
> which perhaps makes things more complicated than my intuition
> above... Any further comments?
>
> Thanks again,
> Ged
>
