This is a static archive of the previous Open Grid Forum Redmine content management system saved from host redmine.ogf.org file /dmsf_files/9297?download=14380 at Fri, 04 Nov 2022 19:55:47 GMT

GridFTP v1.0 improvements

List of suggested GridFTP v1.0 improvements

Uni-directional data transfer in E-block mode
As described in the GridFTP v1.0 draft document, a possible race condition in E-block mode requires that data connections always be established from the data source side. Data retrieval must therefore always be performed in server-active mode, and data storage in server-passive mode. This makes E-block mode hard to use in the presence of firewalls, NAT, etc.

Possible solutions to this problem are:

  - negotiation of the number of parallel data connections ahead of time
  - modifications of the E-block mode protocol
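
The effect of the first option can be illustrated with a toy model of the receiving side. This is only a sketch under stated assumptions, not the wire protocol of the draft: the class and method names are hypothetical, and it assumes that without negotiation the receiver learns the total number of data connections only when the sender signals EOF.

```python
class EBlockReceiver:
    """Toy model of the receiving side of an E-block transfer (hypothetical names)."""

    def __init__(self, negotiated_connections=None):
        # None models the un-negotiated case: the count is unknown until EOF.
        self.expected = negotiated_connections
        self.closed = 0

    def on_eof(self, announced_connections):
        # The sender announces how many data connections it used in total.
        self.expected = announced_connections

    def on_connection_done(self):
        # One data connection has delivered its last block.
        self.closed += 1

    def complete(self):
        # Completion can only be decided once the expected count is known.
        return self.expected is not None and self.closed >= self.expected
```

In this model a receiver created with `EBlockReceiver(negotiated_connections=4)` knows from the start how many connections to wait for, while one created with `EBlockReceiver()` cannot decide completion until `on_eof` has been called.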

Ordering of PASV/SPAS and STOR/RETR commands
This problem is inherited from the RFC 959 FTP protocol. In passive mode, the server must reply to the PASV command with the address of the data socket before it receives the STOR or RETR command. In a distributed server this is not always possible, because the node that will serve the data may depend on the file being transferred. This makes it very difficult to implement passive mode in a distributed FTP server in a scalable way.

There are several possible solutions to this problem:

  1. Introduction of a PRET command, as discussed in the GridFTP v1.0 document, which would carry attributes of the file about to be transferred; PASV would be issued right after PRET
  2. A delayed passive option for the PASV command, which would defer the answer to PASV until STOR/RETR is received and include the data socket address in the (unused) answer to STOR/RETR
  3. Introduction of a new pair of commands, GET and PUT, which would essentially combine the functionality of (1) and (2) into a single command/reply and preserve the semantics of STOR/RETR
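
Option (2) can be sketched as a small control-channel handler. This is a minimal illustration, not a proposed wire format: the class, the `pick_data_node` callback and the choice to reuse the RFC 959 style 227 reply text in the answer to STOR/RETR are all assumptions made for the example.

```python
def format_pasv_reply(host, port):
    """Build an RFC 959 style 227 reply for an IPv4 data socket."""
    h = host.split(".")
    return "227 Entering Passive Mode ({},{},{},{},{},{})".format(
        h[0], h[1], h[2], h[3], port // 256, port % 256)


class DelayedPassiveSession:
    """Toy control-channel handler; pick_data_node is a hypothetical callback."""

    def __init__(self, pick_data_node):
        self.pick_data_node = pick_data_node  # maps a path to (host, port)
        self.passive_pending = False

    def handle(self, line):
        command, _, argument = line.partition(" ")
        command = command.upper()
        if command == "PASV":
            # Defer the answer: the file, and so the data node, is not known yet.
            self.passive_pending = True
            return None
        if command in ("STOR", "RETR") and self.passive_pending:
            self.passive_pending = False
            host, port = self.pick_data_node(argument)
            # The address travels in the answer to STOR/RETR instead.
            return format_pasv_reply(host, port)
        return "500 Command not handled in this sketch"
```

Because the data node is chosen only when the file name is known, a distributed server could route each transfer to whichever node holds the file.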

Possible disconnection of idle control channel socket by some firewalls
This problem is inherited from the RFC 959 FTP protocol. Some firewall software drops idle TCP connections. In some applications, such as a disk cache in front of tape storage, data that exists in the name space is not always readily accessible. In these cases, the client must wait a relatively long time after issuing the STOR/RETR command before the data transfer can even start. The control channel socket connection is then idle for a long time, and the firewall can drop it. The same may apply to the data channel as well.

The solution for the control channel seems easier than for the data channel. The performance data proposed in the GridFTP v1.0 draft can be sent periodically over the control channel to keep it alive. For the data channel, some sort of keep-alive noise could be sent in the direction opposite to the data transfer.
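
The timing side of the control channel keep-alive can be sketched as follows. The function names are hypothetical, and the firewall idle timeout is an assumed, deployment-specific value that the marker interval would have to stay below.

```python
def keepalive_times(wait_seconds, interval_seconds):
    """Offsets (in seconds) at which a keep-alive marker would be sent
    while the client waits wait_seconds for the transfer to start."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    times = []
    t = interval_seconds
    while t < wait_seconds:
        times.append(t)
        t += interval_seconds
    return times


def longest_idle_gap(wait_seconds, interval_seconds):
    """Longest stretch with no control-channel traffic during the wait."""
    marks = [0] + keepalive_times(wait_seconds, interval_seconds) + [wait_seconds]
    return max(b - a for a, b in zip(marks, marks[1:]))
```

A server would pick an interval such that `longest_idle_gap` stays below the idle timeout it expects firewalls on the path to apply.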

Unreliable EOF communication in Stream mode
This problem is inherited from the RFC 959 FTP protocol. As specified by the RFC, during data upload the server is supposed to treat the closing of the data socket as end of file. This makes it impossible for the server to distinguish between a normal end of file and an abnormal client shutdown.

A possible solution to this problem is to introduce a mandatory EOF command, sent by the client to confirm that the entire file was sent successfully over the data socket.
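
The server-side decision could then look like the sketch below. The `EOF <size>` syntax, with the byte count as an argument, is an assumption made for the example; the text above only proposes that some EOF command be sent.

```python
def upload_outcome(bytes_received, eof_command):
    """Classify an upload once the data socket has closed.

    eof_command is the control-channel line received from the client, or None
    if the client never sent one. The "EOF <size>" syntax is hypothetical.
    """
    if eof_command is None:
        return "incomplete: data socket closed without EOF confirmation"
    parts = eof_command.split()
    if len(parts) != 2 or parts[0].upper() != "EOF" or not parts[1].isdigit():
        return "error: malformed EOF command"
    if int(parts[1]) != bytes_received:
        return "incomplete: byte count mismatch"
    return "complete"
```

With this check, a client that crashes mid-upload leaves the server with a closed data socket but no EOF line, so the partial file is not mistaken for a complete one.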

Data integrity verification
In order to protect data from transmission errors, some data integrity verification mechanism should be introduced at the level of the FTP protocol, in addition to TCP packet checksums. Some sort of CRC or another form of checksum or digest should be calculated either over each block of data in block-oriented transfer modes, such as Block and Extended Block, or over the whole file in Stream mode.
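
Both variants can be sketched with CRC-32 from the Python standard library. The framing, a 4-byte CRC appended to each block, is an assumption made for illustration and is not taken from the GridFTP draft.

```python
import zlib


def frame_block(payload):
    """Append a 4-byte big-endian CRC-32 to a block of data."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_block(frame):
    """Return the payload if its CRC-32 matches, otherwise raise ValueError."""
    if len(frame) < 4:
        raise ValueError("frame too short")
    payload, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != received:
        raise ValueError("CRC mismatch")
    return payload


def stream_crc(chunks):
    """Running CRC-32 over a whole file delivered as an iterable of chunks."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc
```

Per-block checks let the receiver request only the damaged block again, while the whole-file value in Stream mode can only be compared after the last byte has arrived.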

Control of types and frequency of feedback from the server
Currently, performance markers come at a fixed 5-second interval and restart markers come at a predefined block size. We should allow the interval between restart markers and between performance markers to be set. Optionally, we may wish to make this extensible so that other transfer event data could be returned as well. For example, if the end host were a mass storage system staging a file, it might send back ETA or % done markers.

Possible solutions would be to use the FEAT/OPTS mechanism or to introduce a new command (tentatively TREV, for TRansfer EVent).
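
If the OPTS route were taken, the server would need to parse the requested intervals. The sketch below assumes a purely hypothetical `OPTS TREV name=seconds;...` syntax and hypothetical option names; only the 5-second default comes from the text above.

```python
DEFAULT_PERF_INTERVAL = 5  # seconds, the fixed interval mentioned above


def parse_marker_options(line):
    """Parse a hypothetical 'OPTS TREV name=seconds;...' line into a dict."""
    parts = line.split(None, 2)
    if len(parts) < 2 or parts[0].upper() != "OPTS" or parts[1].upper() != "TREV":
        raise ValueError("not an OPTS TREV line")
    settings = {"perf": DEFAULT_PERF_INTERVAL}
    if len(parts) == 3:
        for item in parts[2].split(";"):
            item = item.strip()
            if not item:
                continue
            name, sep, value = item.partition("=")
            if not sep or not value.isdigit() or int(value) <= 0:
                raise ValueError("bad option: " + item)
            settings[name.lower()] = int(value)
    return settings
```

Unknown names are simply stored, which leaves room for additional event types such as staging progress.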

Structured directory listings
Incorporate the MLST and MLSD commands as specified in the "Extensions to FTP" Internet Draft. Investigate a structured stat command.
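
What makes these listings structured is that each entry is a set of `name=value;` facts followed by a space and the path name, so a client can parse it without guessing at a platform-specific `ls` layout. A minimal parsing sketch, with a hypothetical function name and assuming that entry layout:

```python
def parse_mlsx_entry(line):
    """Split one MLSD/MLST entry into (facts, pathname)."""
    facts_part, sep, pathname = line.partition(" ")
    if not sep:
        raise ValueError("missing space between facts and pathname")
    facts = {}
    for fact in facts_part.split(";"):
        if not fact:
            continue
        name, eq, value = fact.partition("=")
        if not eq:
            raise ValueError("malformed fact: " + fact)
        facts[name.lower()] = value  # fact names are treated as case-insensitive
    return facts, pathname
```

Only the first space separates facts from the name, so path names containing spaces are preserved.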

IPv6 support
Consider endorsement of the EPRT and EPSV commands as defined by RFC 2428.
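
For reference, the two RFC 2428 message forms can be sketched as below. The function names are hypothetical, and the sketch assumes the recommended `|` delimiter, although the RFC permits others.

```python
import ipaddress
import re


def format_eprt(host, port):
    """Build an EPRT command; protocol 1 is IPv4 and 2 is IPv6 (RFC 2428)."""
    address = ipaddress.ip_address(host)
    protocol = 1 if address.version == 4 else 2
    return "EPRT |{}|{}|{}|".format(protocol, address, port)


def parse_epsv_reply(reply):
    """Extract the TCP port from a '229 ... (|||port|)' reply."""
    match = re.match(r"229 .*\(\|\|\|(\d+)\|\)", reply)
    if match is None:
        raise ValueError("not a valid EPSV reply")
    return int(match.group(1))
```

Because the EPSV reply carries only a port, the client reuses the address of the control connection, which also sidesteps address rewriting problems behind NAT.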