Recommended data transfer method
IMPORTANT - Read this carefully before going on
Never, ever copy data directly from any of the
instrument computers (tessa, marissa, etc.) or from the data computer
(at this moment, elena). The account normally used on those
computers has write permissions over the data, and a
mistyped copy command may result in data
corruption or destruction.
It cannot be stressed enough: this is not a hypothetical
risk; it has happened in the past.
Data can be accessed safely through florence, which we
already use for online data reduction (e.g. NOTCam quicklook). This
machine has a /data directory showing the same contents
as the instrument computers, but it is mounted read-only,
preventing any accidental harm.
NB: copying data can take quite a long time
over WiFi, which is limited (in our current configuration) to a maximum of
54 Mbps. The preferred way to connect to the network when
transferring such large amounts of data is with an Ethernet cable.
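As a rough illustration of why the link matters, the sketch below estimates the minimum transfer time from the nominal link rate. The 10 GB data volume and the 1000 Mbps Ethernet rate are assumptions for the sake of the example, not figures from this page, and real WiFi throughput is normally well below the nominal 54 Mbps.

```shell
# Rough lower bound on transfer time at a given nominal link rate.
# Usage: transfer_seconds SIZE_MB RATE_MBPS
transfer_seconds() {
    size_mb=$1      # data volume in megabytes (1 MB = 8 megabits)
    rate_mbps=$2    # nominal link rate in megabits per second
    echo $(( size_mb * 8 / rate_mbps ))
}

transfer_seconds 10000 54     # 10 GB (assumed) over WiFi: prints 1481, about 25 minutes
transfer_seconds 10000 1000   # same data over gigabit Ethernet (assumed rate): prints 80
```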
NB2: all the following text assumes you're copying the
data into your personal computer, typing commands from that
computer. Of course, you can always copy the data from florence
to your home computer (say, at your institution), if you have access;
to do this, just adapt the commands.
Method 1 - rsync
By far the best method (if the observers can use it) is an rsync
transfer. This program comes installed in many Linux distributions (or can
be installed easily) and has a number of advantages, including exact
reproduction of the data and metadata (e.g. file timestamps). Also,
in the event of an interruption (network problems, or the person performing
the copy cancels it for any reason), the rsync command can be
reissued and will resume where it left off, rather than starting from the beginning.
rsync will use ssh to access the remote location; so,
if you have access to the origin of the data over ssh and
rsync is installed in that computer, too (our case), you can use
it.
There are graphical frontends for rsync (like
Grsync, available for the most
common operating systems), but we'll cover here only the command line
version.
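For example, to copy the files into the current directory after moving to the destination:
$ cd ~/data/NOT/20130706
$ rsync -va "guest@florence:/data/alfosc/ALwg06*.fits" .
A brief explanation of the command: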
- -va
- two flags asking rsync to be
verbose (it will print each file as it finishes copying
it) and to work in archive mode (recursively copy
directories and preserve timestamps, permissions, etc., so as to end
up with a structure identical to the original)
- guest@florence:/data/alfosc/ALwg06*.fits
- is the origin of the
data. "guest@florence" is the user and machine where the data
resides; then comes the full path to the data. Notice that you can include
wildcards ("*", "?", etc.). If the wildcards match nothing on the local
computer, they will be passed on to the remote computer, which will
expand them.
Note that we've quoted the origin; it shouldn't be
needed, but if you're using csh it
is, because csh will report an error when there
is a wildcard it cannot expand successfully, while bash and
other better-behaved shells will just pass it unaltered to the command
if they cannot expand it.
- .
- (a dot) the destination for the data. In this case we've specified the current
directory. Of course, we could have copied the data in one go, like
this:
$ rsync -va "guest@florence:/data/alfosc/ALwg06*.fits" ~/data/NOT/20130706
without the need to move first to the destination directory.
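Since a mistyped origin or wildcard is easy to miss, it can help to preview a copy before running it. The sketch below wraps rsync with its standard -n (--dry-run) option added to the -va flags described above; preview_copy is a hypothetical helper name, not something installed on florence.

```shell
# Show what rsync would copy without transferring anything.
# -n (--dry-run) is added to the -va flags used in the text.
# Usage: preview_copy ORIGIN DESTINATION
preview_copy() {
    rsync -van "$1" "$2"
}

# Example (not run here; needs network access to florence):
# preview_copy "guest@florence:/data/alfosc/ALwg06*.fits" .
```

If the listed files are the ones you expect, repeat the command with plain -va to perform the actual copy.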
NB: of course, the origin directory will depend on the
instrument and the night's prefix. Also, if the data has been archived by
the time you perform the copy, it may have been moved deeper
into another subdirectory. Ask your support staff in case of doubt.
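Because a reissued rsync skips the files it has already copied, an interrupted transfer can simply be retried. The following is a minimal sketch of a retry wrapper; the retry function, the RETRY_DELAY variable and the 30-second default pause are hypothetical choices for illustration.

```shell
# retry MAX_TRIES COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or MAX_TRIES attempts have failed.
retry() {
    retry_max=$1
    shift
    retry_n=1
    until "$@"; do
        if [ "$retry_n" -ge "$retry_max" ]; then
            echo "giving up after $retry_n attempts" >&2
            return 1
        fi
        retry_n=$((retry_n + 1))
        sleep "${RETRY_DELAY:-30}"   # pause before the next attempt
    done
}

# Example (not run here; needs network access to florence):
# retry 5 rsync -va "guest@florence:/data/alfosc/ALwg06*.fits" ~/data/NOT/20130706
```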
Method 2 - SCP/SFTP
If you can use SSH, but don't want to or cannot use rsync, you can
still copy the data using the usual SSH methods (SCP or SFTP), e.g.
$ scp "guest@florence:/data/alfosc/ALwg06*.fits" ~/data/NOT/20130706/
Method 3 - FTP
If for any reason you cannot use the other methods, florence
runs an anonymous FTP server. Of course, this FTP server can only be
reached from within NOT's network.