sp-announcements mailing list archive
Pathological directory/file names on seaborg

From: Thomas M. DeBoni (TMDeBoni_at_lbl_dot_gov)
Date: 12/16/2004

    Dear NERSC Users,
    
    In preparation for the planned upgrades to seaborg's filesystems in
    January, all user directories have been scanned for incorrect file and
    directory names. A small number of items (several hundred) were found to
    have unprintable characters (e.g., control characters) in their names.
    This may be important to you because items with ill-formed names
          1) are not included in routine disaster-recovery backups;
          2) can adversely affect calculations of usage against your quotas;
          3) will not survive the movement of the $HOME file systems in
             January;
          4) may indicate scripts or other programming gone awry.
    
    While many of the pathological names appear to be one-time events (e.g.,
    typing errors), any indicated scripting or programming errors should be
    corrected.
    
    We recommend that you scan your directories for such files using the "ls"
    command with the "-b" option, and examine its output for individual
    characters or three-digit numeric strings that immediately follow a
    backslash ("\") character. Here is an example with two such files:
    
    The first file list below appears to contain three filenames.
    
    $ ls
    abcdef  pr      st
    
    The second list of the same directory shows there are only two files, each
    with pathological characters in its name; the problem characters are shown
    as "\002" and "\011".
    
    $ ls -b
    abc\002def  pr\011st
    
    The second file contains an embedded TAB character, which makes the name
    print in two separate segments in the absence of the "-b" option to the
    "ls" command.
    
    The long-form list also looks strange.
    
    $ ls -l
    total 0
    -rw-------   1 deboni   mpccc             0 Dec 15 13:54 abcdef
    -rw-------   1 deboni   mpccc             0 Dec 15 13:56 pr     st
    
    But with the "-b" option added, the invisible characters are again shown.
    
    $ ls -lb
    total 0
    -rw-------   1 deboni   mpccc             0 Dec 15 13:54 abc\002def
    -rw-------   1 deboni   mpccc             0 Dec 15 13:56 pr\011st
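
    For a large directory tree, reading "ls -b" output by eye can be slow. As
    a rough sketch, the "find" command can list only the ill-formed names.
    This assumes a "find" whose "-name" patterns accept POSIX character
    classes such as "[:print:]" (not verified on seaborg). The function name
    "find_bad_names" and the throwaway directory are illustrative only.

```shell
# Sketch: list entries whose names contain at least one non-printable byte.
# Assumption: this system's find accepts POSIX classes like [:print:] in -name.
find_bad_names() {
    # LC_ALL=C treats every byte outside printable ASCII as non-printable.
    LC_ALL=C find "${1:-.}" -name '*[![:print:]]*' -print
}

# Demonstration in a throwaway directory, using the two names shown above.
demo=$(mktemp -d) || exit 1
touch "$demo/$(printf 'abc\002def')" "$demo/$(printf 'pr\011st')" "$demo/good"
bad_count=$(find_bad_names "$demo" | wc -l | tr -d ' ')
echo "ill-formed names found: $bad_count"
rm -rf "$demo"
```

    In real use, "find_bad_names $HOME" would print one path per line. A name
    containing a newline would span two lines of output, so the count is
    only a guide.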
    
    Files with ill-formed names may be deleted or renamed in a number of ways.
    The easiest option may be to ignore them, and let them disappear with the
    $HOME moves. (Such files in $SCRATCH will not go away, since that file
    system is not being moved.) An easy option for important files or
    directories would be to copy them to properly formed new names. Here's an
    example of this operation:
    
    $ ls -lb ab*
    -rw-------   1 deboni   mpccc            11 Dec 15 14:04 abc\002def
    $ cp ab* abcdef
    $ ls -lb ab*
    -rw-------   1 deboni   mpccc            11 Dec 15 14:04 abc\002def
    -rw-------   1 deboni   mpccc            11 Dec 15 14:05 abcdef
    
    The new file, "abcdef", now contains the same data as the incorrectly
    named file "abc\002def". This method will work only if the source file
    is uniquely identified in the "cp" command (here, the pattern "ab*" must
    match that one file and nothing else). Similar techniques will work for
    the "mv" command, which works on both files and directories; the "rm"
    and "rmdir" commands are also available, but be careful: deleted items
    are gone permanently.
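
    As a cautious sketch of the same idea with "mv": the "?" wildcard matches
    exactly one character, so "abc?def" can stand in for the unprintable
    name. The guard below renames only when the pattern matches exactly one
    existing item and the new name is not already taken. The throwaway
    directory and the file contents are made up for illustration.

```shell
# Sketch: rename an ill-formed name only when the wildcard is unambiguous.
start_dir=$PWD
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
printf 'some data\n' > "$(printf 'abc\002def')"   # illustrative bad name

set -- abc?def          # expand the pattern into the positional parameters
if [ "$#" -eq 1 ] && [ -e "$1" ] && [ ! -e abcdef ]; then
    mv -- "$1" abcdef   # exactly one match and no clash: safe to rename
    status=renamed
else
    status=skipped      # no match, several matches, or target already exists
fi

renamed_content=$(cat abcdef)
remaining=$(ls | wc -l | tr -d ' ')
echo "$status: $remaining item(s) left, content: $renamed_content"
cd "$start_dir" && rm -rf "$demo"
```

    Unlike "cp", this leaves no copy under the old name, so checking the
    match count first matters more here.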
    
    Here's a schema for finding and repairing ill-formed names:
    
    1) Go to your $HOME or $SCRATCH space on seaborg
    2) Generate a recursive list of that space, with the command
        "ls -alRb > AllMyFiles"
        (This step may generate a large file.)
    3) Open the list in a text editor, for instance with the command
        "vi AllMyFiles"
    4) Find each bad file or directory name, for instance with the vi commands
        "/\\" and "n"
        (Note: we expect that perhaps 10% of our users will find some.)
    5) In another seaborg connection window, for each bad name, "cd" to that
        item's parent directory, and change the bad name with a safe command
        (for example, "cp" or "mv" to a new, well-formed name)
    6) If you're uncertain about changing names or copying files, contact the
        consultants for help, by email at consult_at_nersc_dot_gov or by phone at
        1-800-666-3772 or 510-486-8611.
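
    Steps 2 through 4 of the schema above can also be sketched without an
    editor, by letting "grep" pick out the lines of the listing that contain
    a backslash. This is only an illustration: the throwaway directory and
    file names are invented, and some versions of "ls -b" may also escape
    spaces or literal backslashes, so not every hit is necessarily a
    control character.

```shell
# Sketch of steps 2-4: build the recursive listing, then search it for "\".
demo=$(mktemp -d) || exit 1       # stand-in for $HOME or $SCRATCH
listing=$(mktemp) || exit 1       # plays the role of "AllMyFiles"
mkdir "$demo/sub"
touch "$demo/sub/$(printf 'abc\002def')" "$demo/sub/good"

# Step 2: recursive listing with unprintable characters shown as escapes.
(cd "$demo" && ls -alRb) > "$listing"

# Steps 3-4: print each suspicious line with its line number in the listing.
grep -n '\\' "$listing"
hits=$(grep -c '\\' "$listing")
echo "lines with escaped characters: $hits"
rm -rf "$demo" "$listing"
```

    The line numbers printed by "grep -n" can then be used to jump to the
    matching directory header in the full listing.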
    
