LW1DSE > TECH     31.12.17 02:02l 407 Lines 19874 Bytes #999 (0) @ WW
BID : 1127-LW1DSE
Read: GUEST
Subj: Long File Names on DOS 7.10
Path: IZ3LSV<I0OJJ<GB7CIP<LU4ECL<LU4ADN<LU7DQP
Sent: 171231/0006Z @:LU7DQP.#LAN.BA.ARG.SOAM #:26032 [Lanus Oeste] FBB7.00i
From: LW1DSE@LU7DQP.#LAN.BA.ARG.SOAM
To  : TECH@WW


[»»» TST HOST 1.43c, UTC diff:5, Local time: Mon Nov 20 18:52:49 2017 «««]

                            Long File Names
                            ---------------

        Long File Names (LFNs) were introduced in the original Windows 95,
and remain a significant compatibility issue to this day.

        But to understand LFNs, you have to understand how normal file names
work! A good learning tool for this purpose is "DirSnoop", which can be
downloaded from the Internet. This views directory entries in their raw form,
and it can be fun to associate DirSnoop as a non-default right-click action
for "File Folder" (Win9x-speak for "directory").

        The internal structure of LFNs is more rigorously documented
elsewhere on the Internet; the focus here will be on practical things that go
wrong with LFNs and how to anticipate and manage these. It's also crucial not
to get confused between LFNs and other Win9x file system compatibility
issues, such as FAT32 or VFAT (e.g. DOS Compatibility Mode).

8.3 names

        Every file system data object is pointed to by a directory entry of
32 bytes in length, which contains information about the file: name, address
of the first cluster (if any), length of the file in bytes, time and date
stamps, and a set of attribute bits that determine whether the file was
recently backed up, should be displayed, may be written to, etc.

        But not every data object within the file system is a file - some of
the attribute bits are used to signify subdirectories or volume labels. There
should only be one volume label per disk volume, found in the root; however,
subdirectories may abound. A subdirectory is stored like a file, except that
the data clusters of this "file" are interpreted as further lists of file
system objects.

        There are 11 bytes set aside for the object name in a directory entry.
All 11 can be used for volume labels, but in the case of files and (sub)
directories, the 11 bytes are interpreted as 8 bytes of name, and 3 bytes of
extension (the "three letters after the dot" that determine what should be
done with the file). This pattern of 8 name characters plus a 3 character
extension is referred to as the 8.3 naming convention, which also excludes
spaces, certain other characters, and uses only upper case letters internally.
It's quite a restrictive convention to live with, if you want to use
meaningful names; hence the challenge to add long file name support while
remaining compatible with programs bound to 8.3 names and conventions.
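
        As a rough illustration of the 32-byte entry described above, here is
a minimal Python sketch that decodes the name, attribute byte, first cluster
and length. The field offsets follow the commonly documented FAT layout; the
function name parse_entry and the hand-built sample entry are made up for this
example, and the time and date stamps are skipped for brevity.

```python
import struct

# Attribute bits stored in byte 11 of a directory entry.
ATTR_READ_ONLY = 0x01
ATTR_HIDDEN = 0x02
ATTR_SYSTEM = 0x04
ATTR_VOLUME_LABEL = 0x08
ATTR_DIRECTORY = 0x10
ATTR_ARCHIVE = 0x20


def parse_entry(raw):
    """Decode the main fields of one 32-byte 8.3 directory entry."""
    if len(raw) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    name, ext, attr = struct.unpack_from("<8s3sB", raw, 0)
    (cluster_hi,) = struct.unpack_from("<H", raw, 20)  # used by FAT32 only
    cluster_lo, size = struct.unpack_from("<HI", raw, 26)
    base = name.decode("ascii", "replace").rstrip()
    suffix = ext.decode("ascii", "replace").rstrip()
    return {
        "name": base + "." + suffix if suffix else base,
        "is_directory": bool(attr & ATTR_DIRECTORY),
        "is_volume_label": bool(attr & ATTR_VOLUME_LABEL),
        "read_only": bool(attr & ATTR_READ_ONLY),
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "size": size,
    }


# Hypothetical sample: README.TXT, archive bit set, cluster 5, 1234 bytes.
sample = (b"README  TXT" + bytes([ATTR_ARCHIVE]) + bytes(8)
          + struct.pack("<H", 0) + bytes(4) + struct.pack("<HI", 5, 1234))
entry = parse_entry(sample)
print(entry["name"], entry["size"], entry["first_cluster"])
```

        Note that the dot is not stored; it is put back only when the name is
displayed, which is why the 11 name bytes are padded with spaces.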

Long File Names

        LFNs are stored as directory entries with an "impossible" combination
of attribute bits - thus causing them to be safely ignored by most software
written before Windows 95. Like volume labels, they point to no data clusters;
in fact, they are almost pure character data, with little besides a sequence
number and a checksum of the 8.3 name. Each char takes two bytes of space, so
that LFNs up to 13 characters can be held within a single directory entry;
longer names take up additional entries.
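
        To put numbers on this, here is a small Python sketch. It assumes the
usual VFAT layout, in which the "impossible" attribute combination is 0x0F
(read-only + hidden + system + volume label) and each LFN entry carries 13
two-byte characters. The helper names are invented for the example.

```python
LFN_ATTR = 0x0F           # read-only + hidden + system + volume label
CHARS_PER_LFN_ENTRY = 13  # assumed VFAT layout: 13 two-byte chars per entry


def is_lfn_entry(raw):
    """True if a 32-byte directory entry carries LFN character data."""
    return (raw[11] & 0x3F) == LFN_ATTR


def entries_needed(long_name):
    """Directory entries used by one file: LFN entries plus the 8.3 entry."""
    lfn_entries = -(-len(long_name) // CHARS_PER_LFN_ENTRY)  # ceiling division
    return lfn_entries + 1


for name in ["My Documents", "Invoice 001.doc"]:
    print(name, "->", entries_needed(name), "directory entries")
```

        So even a modest 15-character name costs three directory entries
under these assumptions: two for the LFN and one for the 8.3 name.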

        Spaces can be used, but some characters are still reserved for use as
delimiters or redirection symbols. Both upper and lower case letters can be
used, although the case is used only for display within the Win9x system. An
LFN is generated whenever a file name that is invalid as 8.3, or isn't in
ALLCAPS, is created within Windows through the Win32 API. File names that are
8.3-valid, but not in ALLCAPS, have a "cosmetic" LFN to preserve the letter
case for display - something that becomes important when uploading to
UNIX-based servers or web sites.

        It's important to remember that the LFN is used in addition to the
8.3 name within the Win9x file system, and is associated with it by virtue of
its position in the directory. All other information about the file (length,
cluster address, etc.) is stored within the 8.3-named entry. When the file is
stored in contexts other than the Win9x file system, this may not be the
case; for example, only one name is used within a .zip archive, and the other
name is lost.

        Other file systems handle longer names (with or without spaces)
natively, but may have limitations of their own, e.g. a file system used on
CD-ROM may not accept names that start with a space. But few file systems
support or store alternate names for the same object.

LFN Risks and Issues

        These include software compatibility issues, data corruption risks,
slight performance degradation, and problems with command line parameter
management:

    * Pre-LFN disk utilities
    * Pre-LFN software
    * DOS and DOS Mode
    * LFNs used as 8.3 data
    * Numeric Tails
    * Ambiguous LFNs
    * Ambiguous name display
    * False extensions
    * LFN bloat
    * Win9x internals
    * Non-LFN volumes in Win9x
    * Parameter management

Pre-LFN disk utilities

        Whereas "normal" pre-LFN programs will ignore LFNs, some programs
that are written to manage or repair the details of the file system will trip
over them. File system repair utilities such as MS-DOS 6.x Scandisk, pre-LFN
Norton Disk Doctor, etc. will flag them as illegal entries (as indeed they
are, within pre-LFN file systems) and may delete them; pre-LFN "directory
sort" utilities will move them away from the real 8.3 named entries they are
supposed to be associated with, as will pre-LFN defraggers such as MS-DOS 6.x
Defrag and pre-LFN Norton Speed Disk.

        While the file is still intact and accessible via the underlying 8.3
name, problems arise when inter-file links are broken or when it's not
possible to guess the identity of the file (e.g. which "Invoice *.doc" is
INVOIC~9.DOC?). Windows uses LFNs internally (crazy, but true), so the
OS may not be able to run if these are mangled.

Pre-LFN software

        Pre-LFN software won't see LFNs; instead, the raw underlying 8.3
names are seen and used. This can create problems if such files are copied,
moved or renamed; changing the 8.3 name has the effect of blowing away the
LFN associated with it. Under these circumstances, a rename will cause the
LFN to change to the new 8.3 name; a copy or move will simply leave it behind.

        This difference is useful when one needs to deliberately strip away
LFNs, either for performance reasons or for use on disks that are destined
to be maintained by pre-LFN environments.

DOS and DOS Mode

        Anything prior to WinStart.bat within the startup axis can't see LFNs,
and neither can Win9x DOS mode or previous versions of MS-DOS.

        The only exceptions to this are utilities that access the directory
entries directly, rather than relying on API calls. Examples include the
rather hairy Microsoft LFNBack.exe that is on most Win9x CDs, and the safer
and better 3rd-party downloadable freeware DOSLFNBk.exe; both of these can be
used to backup LFNs from within DOS mode.

LFNs used as 8.3 name data

        There's a need for DOS Mode archivers or backup utilities that can
"see" LFNs, but you can go seriously wrong here. For example, the freeware
InfoZip might appear just the ticket as it sees LFNs and runs in both Windows
and DOS Mode, and the .zip file standard itself is quite comfortable with
LFNs.

        However, if you create a .zip such that this contains LFNs, and then
extract it in a non-LFN-aware environment (such as PKUnZip 2.04g, or InfoZip
in DOS Mode), you will get neither the original 8.3 name (which wasn't stored
in the archive) nor the LFN. What you will get is an 8.3 name taken as-is
from the LFN name data; a name that may contain lower case letters, spaces,
or other illegal characters. For example, the name "My Documents" might be
extracted as "My Docum.ent", and as that space is within the actual 8.3 name
itself, it will cause problems in both DOS Mode and Windows.
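
        The "My Documents" mishap can be reproduced with a few lines of
Python. This is only a guess at the mechanism (the long name chopped into 8+3
characters with no cleanup at all), not the actual logic of any real unzip
tool; raw_truncate is a made-up name.

```python
def raw_truncate(long_name):
    """Chop an LFN into 8+3 characters as-is, with no 8.3 cleanup."""
    field = long_name[:11].ljust(11)
    name, ext = field[:8].rstrip(), field[8:].rstrip()
    return name + "." + ext if ext else name


result = raw_truncate("My Documents")
print(repr(result))
# The result breaks 8.3 rules: it contains a space and lower case letters.
print("has space:", " " in result, " has lower case:", result != result.upper())
```

        A name like that cannot be typed at a DOS prompt without the space
being taken as a delimiter, which is why the wildcard approach is needed to
get rid of it.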

        For this reason I prefer to use PKZip 2.50 (PKZip25.exe) rather than
InfoZip, as it will refuse to operate in DOS or DOS Mode.

Numeric Tails

        One of two Great Discredited Registry Hacks (the other being killing
off IsShortcut as a way of hiding icon shortcut arrows) involved a setting
that changed the way 8.3 names were generated from LFNs when creating names.

        The LFN-to-8.3 naming method is:

* Spaces are stripped, then the first 6 characters are used as the name stub,
  followed by a tilde (~ or "squiggle") and next required digit {1,2,3...},
  then a dot (not stored internally) and the first 3 characters following the
  last dot in the LFN. The digit chosen will be the lowest that avoids a
  same-name clash with an 8.3 name already present in that directory; if all
  digits 1-9 are taken, the name stub is shortened and the number takes an
  extra digit to the left (i.e. ...NEXTIS~8.EXT, NEXTIS~9.EXT,
  NEXTI~10.EXT...)

Examples:

"A Long File Name.a.b.c.d" -> ALONGF~1.D
"My Safe Picture.gif.exe" -> MYSAFE~1.EXE
"My Safe Picture.gif.executable" -> MYSAFE~2.EXE

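        The method and examples above can be sketched in Python as follows.
This follows only the rules as stated here (strip spaces, six-character stub,
lowest free numeric tail, extension from the last dot); the real Win9x code
has further rules for illegal characters and very crowded directories that are
not modelled, and names that are already 8.3-valid are not special-cased. The
name short_name is invented for the example.

```python
def short_name(long_name, existing):
    """Generate an 8.3 alias for long_name that is not already in existing."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                      # no dot at all: no extension
        base, ext = long_name, ""
    # Strip spaces, and the leftover dots, which 8.3 names cannot contain.
    stub = base.replace(" ", "").replace(".", "").upper()
    ext = ext.replace(" ", "").upper()[:3]
    n = 1
    while True:
        tail = "~%d" % n
        candidate = stub[:8 - len(tail)] + tail   # stub shrinks as tail grows
        if ext:
            candidate += "." + ext
        if candidate not in existing:
            return candidate
        n += 1


existing = set()
names = []
for lfn in ["A Long File Name.a.b.c.d",
            "My Safe Picture.gif.exe",
            "My Safe Picture.gif.executable"]:
    alias = short_name(lfn, existing)
    existing.add(alias)
    names.append(alias)
print(names)
```

        The third name only becomes MYSAFE~2.EXE because MYSAFE~1.EXE is
already in the directory; on its own it would have been MYSAFE~1.EXE.
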
        If the registry setting to disable numeric tails was added, these
names would be created differently:

"A Long File Name.a.b.c.d" -> ALONGFIL.D
"My Safe Picture.gif.exe" -> MYSAFEPI.EXE
"My Safe Picture.gif.executable" -> MYSAFE~1.EXE

        Where you have ambiguity like this, you no longer have a function
(i.e. a process that generates only one result) and situations arise where
"behavior is undefined".

Ambiguous LFNs

        Any LFNs that have the same first six non-space characters and
extension will generate the same name stub. The numeric tail is generated by
enumeration rather than identity, so it is a matter of what order they were
created in that determines which is called what. For example, consider this:

"Invoice 001.doc" -> INVOIC~1.DOC
"Invoice 002.doc" -> INVOIC~2.DOC
"Invoice 003.doc" -> INVOIC~3.DOC
"Invoice 004.doc" -> INVOIC~4.DOC

        Now, "Invoice 003.doc" is deleted. Will "Invoice 005.doc" be
INVOIC~3.DOC, or INVOIC~5.DOC? If these files are linked to from an Excel 6
spreadsheet (which is LFN-unaware), will the link to "Invoice 003.doc" now
point to "Invoice 005.doc"? If this directory is backed up, and then restored
elsewhere with the files created out of sequence (or into a dir that already
has several "Invoice ???.doc" files present) will that spreadsheet's links be
anything remotely sane?

        Moral: Don't use the same first six characters for lots of file
names; it also bedevils data recovery!
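
        Here is a small Python simulation of the invoice scenario. It
assumes the "lowest free digit" rule given under Numeric Tails really is what
decides the tail; the alias function is a simplified stand-in written for
this example, not Windows' own code.

```python
def alias(long_name, taken):
    """Simplified 8.3 alias: six-char stub plus the lowest free numeric tail."""
    stem, _, ext = long_name.rpartition(".")
    stub = stem.replace(" ", "").upper()
    n = 1
    while True:
        tail = "~%d" % n
        cand = "%s%s.%s" % (stub[:8 - len(tail)], tail, ext.upper()[:3])
        if cand not in taken:
            return cand
        n += 1


directory = {}   # maps 8.3 alias -> long file name
for name in ["Invoice 001.doc", "Invoice 002.doc",
             "Invoice 003.doc", "Invoice 004.doc"]:
    directory[alias(name, directory)] = name

before = directory["INVOIC~3.DOC"]
del directory["INVOIC~3.DOC"]                 # "Invoice 003.doc" is deleted
directory[alias("Invoice 005.doc", directory)] = "Invoice 005.doc"
after = directory["INVOIC~3.DOC"]

print("INVOIC~3.DOC was", before, "and is now", after)
```

        Under that rule, anything that remembered INVOIC~3.DOC now quietly
opens a different invoice.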

        Life would be a lot cleaner and simpler had Microsoft called
"Program Files" and "My Documents" "Programs" and "Docs" respectively. That
avoids both parsing issues and the problem of ambiguous 8.3 names. For
example, if you were to backup in this order...

"C:\Program Files" -> PROGRA~1
"C:\Program Fools" -> PROGRA~2

...and restored these in this order...

"C:\Program Fools" -> PROGRA~1
"C:\Program Files" -> PROGRA~2

...then apps that refer to their paths via 8.3 names (e.g. anything involving
.inf handler, AutoExec.bat Path, etc.) will get the wrong directory and won't
work.

        This isn't such an unlikely scenario as it sounds; it's common to
rename away a "Program Files" when doing a parallel Windows installation, and
if you'd renamed it to "Program Files old" rather than, say, "ex-Program
Files", your new installation might track PROGRA~2 instead of PROGRA~1. That
might cause problems when you try and integrate the two into one working
installation.

        This enumeration-vs.-identity dilemma is a basic info-theory boo-boo
that recurs in Plug-n-Play and drive letter management. It is one cause of
problems that can arise after restoring Windows-based backups; the others
being files that were open or in a dynamic state when the backup was made,
and inter-file inconsistencies due to processes running within the backup
period.

        The only way to really backup all the information within the file
system (while regenerating actual cluster positioning) is to back everything
up outside of Windows, and use a separate process to backup the LFNs (e.g.
DOSLFNBk.exe). Most of the time it works out OK, as long as you don't have
hybrid LFN/8.3 access to similarly-named data files.

Ambiguous LFN display

        There are certain characters that can be valid within LFNs, but will
not be displayed by the Windows interface (i.e. Explorer.exe in its various
guises). Typically the trouble character is shown as an underscore ("_")
character.

        Suspect this if you have what appears to be two entries with the same
name (having excluded a .pif or .lnk etc.) or a file that you can't seem to
"get a grip on" ("not found" errors when trying to access or delete it).

        As long as the entire directory is not deranged (genuine same-name
entries are quite common in an insane file system) and the rest of the file's
directory information is sane (i.e. not a 57G file on a 2G drive) then you
can use the DOS wildcard approach to rename or remove it.

        You may also have to do this in the case of invalid 8.3 names
generated by processing LFN name data outside an LFN-aware environment.

False extensions

        This has significant safety implications, and is already exploited by
malware. Because you can have as many dots as you like within an LFN (the dot
is a valid character under LFN rules, though not under 8.3 rules), you can get
misleading names such as:

"LifeStages.txt.vbs"
"My Safe Picture.GIF.pif"
"Zipped_Files.zip.exe"

        Couple this with the Microsoft default practice of hiding file
extensions for registered file types, and you have a recipe for disaster. An
.exe can have any icon embedded with it, so it's trivial to create a
Zipped_Files.exe with a WinZip icon within - looking just like a "safe"
archive.
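
        A rough Python sketch of the trick, and of a check for it, follows.
The set of risky extensions is just the ones from the examples above, not a
complete list, and shown_name only approximates what Explorer displays when
extensions are hidden. Both function names are made up.

```python
import os.path

# Final extensions from the examples above; far from a complete list.
RISKY = {".vbs", ".pif", ".exe"}


def shown_name(filename):
    """Roughly what the user sees when the real extension is hidden."""
    return os.path.splitext(filename)[0]


def looks_disguised(filename):
    """True if a risky real extension hides behind a harmless-looking one."""
    stem, last = os.path.splitext(filename)
    _, inner = os.path.splitext(stem)
    return last.lower() in RISKY and inner != ""


for name in ["LifeStages.txt.vbs", "My Safe Picture.GIF.pif",
             "Zipped_Files.zip.exe", "Setup.exe"]:
    print("%-26s shown as %-20s disguised: %s"
          % (name, shown_name(name), looks_disguised(name)))
```

        The user sees "Zipped_Files.zip" while the system runs an .exe; the
extension that decides what happens is always the one after the last dot.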

        The problem is compounded when certain dangerous extensions are
hidden, regardless of how you set up Explorer; .shs, .shb, .lnk and .pif are
all dangerous file types that fall into this category. Part of the problem
can be managed by renaming away SHSCrap.dll so that .shs and .shb files can't
be processed by the system.

        In this respect, Windows Millennium exacerbates the problem by making
it more difficult to rename away system files (SFP replaces them on the fly)
and by losing the facility to display the real 8.3 name via the file's
Properties.

LFN bloat

        Each directory sector can hold 16 entries, or 8 entries if all names
have fairly short LFNs. Every subdirectory starts with two entries for the .
(self) and .. (parent) pointers. A FAT32 volume under 8G in size will use 4k
clusters by default, i.e. can hold 126 directory entries (or 63 with short
LFNs) before having to link in additional clusters and thus potentially
become fragmented.
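
        The arithmetic behind those figures, as a short Python sketch. The
function name and parameters are invented for the example; "short LFN" is
taken to mean a name that fits in a single extra directory entry.

```python
ENTRY_SIZE = 32   # bytes per directory entry


def files_per_block(block_bytes, lfn_entries_per_file=0, reserved=0):
    """How many files fit in a directory block of the given size.

    reserved is the number of slots already used, e.g. 2 for the . and ..
    entries in the first cluster of a subdirectory.
    """
    slots = block_bytes // ENTRY_SIZE - reserved
    return slots // (1 + lfn_entries_per_file)


print("512-byte sector, 8.3 only  :", files_per_block(512))
print("512-byte sector, short LFNs:", files_per_block(512, 1))
print("4k cluster, 8.3 only       :", files_per_block(4096, 0, reserved=2))
print("4k cluster, short LFNs     :", files_per_block(4096, 1, reserved=2))
```

        Names that need two or three LFN entries each cut the count to a
third or a quarter, so the first cluster fills up even sooner.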

        But directories can often hold thousands of files, so fragmentation
and slowdown are common. Fragmentation not only impacts performance, but
increases the size of the double-zero on the dartboard (the time during which
a crash will interrupt a file write operation and thus cause data corruption).

        This is probably one of the reasons why users complain about "My
Documents" taking long to "open", and is a reason why one gets fed up with
programmers who blithely create thousands of temp files with lower-case names
that generate cosmetic LFNs and thus double the bloat factor.

        You can imagine the slowdown when processes have to create new,
arbitrary-but-unique named temp files at the end of a 20-cluster fragmented
mess of a temp directory.

        Other scenarios where directory length (rather than file load) causes
slowdowns are the case of a software botch that causes masses of zero-length
.inf files to be spawned, and the Prolin/Creative virus that moves wads of
.jpg and .zip files to the root of the C: volume. The latter causes slowdown
on FAT32 volumes, and oddball errors on FAT16 volumes (as FAT16 has a fixed
limit on the number of entries the root directory can hold).

        The other form of LFN bloat is nonsense like "C:\Program Files\
Microsoft\Common Files\Office\Microsoft Common Files\Some Common Files for MS
Office\Version 10\Standard edition\Shared\Blob.dll". These gratuitously long
paths break several backup utilities, CD file system standards, Path
environment and Command.com parameter space restrictions, and are thrown up
as errors by ScanDisk in DOS Mode.

        Consider this if you were wondering why "clean up" batch files with
lines like 'Del "C:\Windows\Application Data\Microsoft\Internet Explorer\
Quick Launch\Launch Outlook Express.lnk"' don't seem to work.

Win9x internals

        Strangely, some fresh-for-Win95 parts of Windows 9x are not LFN aware.
A classic example is the .inf handler that is used when installing hardware
drivers and so forth; it cannot list or find LFNs, and will often throw up a
"where-is-it?" browse dialog when trying to access stuff that you'd pointed
it to just one mouse click and dialog ago.

        This hasn't got better, right up to Windows ME. It's odd, because the
PnP and .inf handler is new to Win9x; it's not a legacy thing brought over
from Win3.yuk - presumably it was one of the first things they developed and
stabilized before kludging on LFN support.

Non-LFN volumes in Win9x

        Sometimes users report they are unable to copy LFNs onto a particular
hard drive volume; all they see there are 8.3 names. I've never seen this,
but then I always do my FDisking and Formatting in DOS Mode.

        I've read that this can happen if you do these actions within Windows,
and then start working on the new volume while in the same Windows session;
the problem goes away after restarting Windows.

        You may also see this if you explicitly disable LFN support within
System, Performance, File System, Troubleshooting for some reason. DOS
Compatibility Mode and Safe Mode won't do this, however.

Parameter management

        The space character is used as a parameter delimiter by Command.com,
and the parameter parsing logic of many programs. This is countered by
enclosing LFNs with spaces in quotes, so that whereas My Proctologist would
be seen as two parameters, "My Proctologist" is seen as one.
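
        A toy Python splitter shows the effect. It is a sketch of the
general idea (spaces delimit, quotes group), not a model of how Command.com
or any particular program actually parses its command tail; split_args is a
made-up name.

```python
def split_args(command_tail):
    """Split on spaces, except inside double quotes; quotes are stripped."""
    args, current, in_quotes = [], "", False
    for ch in command_tail:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == " " and not in_quotes:
            if current:
                args.append(current)
                current = ""
        else:
            current += ch
    if current:
        args.append(current)
    return args


print(split_args('My Proctologist'))
print(split_args('"My Proctologist"'))
print(split_args(r'C:\Program Files\SomePath\SomeApp.exe'))
print(split_args(r'"C:\Program Files\SomePath\SomeApp.exe"'))
```

        Without quotes, the path falls apart at the first space and the
first "parameter" is just C:\Program, which is not a thing that can be run.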

        Command line processors may add quotes, or not, and parser logic may
strip quotes, or not. For example, you need to add an explicit "%1" to the
command line for LView Pro, else it won't see associated files if there's a
space in the file spec, but if you do that for IView, it won't see anything.

        Consider this the likely problem when you have "unable to run
Program" (for a reference to "C:\Program Files\SomePath\SomeApp.exe") errors
on startup, or starting an application, or launching a file.

        Consider this also when you see "can't find xxx" errors when the file
you are trying to "open" happens to be in a directory with a space in it, or
itself has a name with a space in it.

        Somewhere, either in a shortcut, an .ini file, or within
HKEY_CLASSES_ROOT, you will see a %1 that should be "%1", or a command line
like "C:\Program Files\SomePath\SomeApp.exe" that needs an extra set of
quotes. Exported .reg files have quotes around string values, so explicit
quotes appear as "doubled" there. HKEY_CLASSES_ROOT takes the first parameter
as %1, so adding an explicit "%1" via the "front door" has the effect of
enquoting the auto-generated %1 parameter.

Makes you really wish they'd called it C:\PROGRAMS, doesn't it?

(C) Chris Quirke, all rights reserved - January 2001
╔════════════════════════════════════════════════════════════════════════════╗
║ Osvaldo F. Zappacosta. Barrio Garay (GF05tg) Alte. Brown, Bs As, Argentina.║
║ Mother UMC µPC:AMD486@120MHz 32MbRAM HD SCSI 8.4Gb MSDOS 7.10 TSTHOST1.43C ║
║               6 celdas 2V 150AH. 18 paneles solares 10W.                   ║
║                  lw1dse@yahoo.com ; lw1dse@gmail.com                       ║
╚════════════════════════════════════════════════════════════════════════════╝

