Sanitize File Names Using VBScript
We use an ASP-based form to let our customers submit files created as part of a registration process. During the submission, the file is sent as an email attachment to our support staff, and a copy is written to a file server as a backup. The application uses a standard Win32 Save dialog that suggests a default filename, and in nearly every case the customer submits the file without changing that name. Because nearly every backup then carries the same default name, retrieving the right copy when the email process fails is difficult at best.
I modified the form to create a unique filename using values from certain form elements. Since these values can, and often do, use characters that are invalid as part of a filename, I needed a way to “sanitize” the name during the submission process. I wrote the following VBScript code to replace these characters before attempting to save the file.
Function SanitizeFilename(byVal strFilename, byVal strReplChar)
    Set objRegExp = New RegExp ' Create new RegExp object

    ' Define regex pattern and set Global replacement property
    objRegExp.Pattern = "[\x00-\x1f\x22\\/:\*\?<>\|]"
    objRegExp.Global = True

    ' Check strReplChar parameter for invalid length or character,
    ' default to underscore
    If Len(strReplChar) = 1 Then
        strReplChar = objRegExp.Replace(strReplChar, "_")
    Else
        strReplChar = "_"
    End If

    ' Return clean filename
    SanitizeFilename = objRegExp.Replace(strFilename, strReplChar)
    Set objRegExp = Nothing
End Function
This function accepts two arguments: strFilename, which receives the string created from the form values, and strReplChar, which receives the character used for replacement. The function uses a regular expression pattern to match the invalid characters: control characters 0 through 31, the double quote (\x22), and the backslash, forward slash, colon, asterisk, question mark, angle brackets, and pipe. The strReplChar argument is checked as well: if it is not exactly one character long, or is itself one of the invalid characters, it defaults to an underscore. There is not much validation beyond that. The Replace method of the RegExp object does the heavy lifting before the function returns the modified string.
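To make the behavior of the two arguments concrete, here is a minimal usage sketch. It assumes the script is run under Windows Script Host with cscript.exe (in an ASP page you would use Response.Write instead of WScript.Echo). The variable names, the sample values, and the .dat extension are hypothetical stand-ins for the real form elements, and the function is repeated, with a Dim added for Option Explicit, only so the example runs on its own.

```vbscript
' Usage sketch; the sample values below are hypothetical, not from the real form.
Option Explicit

Function SanitizeFilename(ByVal strFilename, ByVal strReplChar)
    Dim objRegExp
    Set objRegExp = New RegExp
    objRegExp.Pattern = "[\x00-\x1f\x22\\/:\*\?<>\|]"
    objRegExp.Global = True

    ' Fall back to underscore if the replacement is not a single valid character
    If Len(strReplChar) = 1 Then
        strReplChar = objRegExp.Replace(strReplChar, "_")
    Else
        strReplChar = "_"
    End If

    SanitizeFilename = objRegExp.Replace(strFilename, strReplChar)
    Set objRegExp = Nothing
End Function

Dim strCompany, strOrderId, strRaw
strCompany = "Acme: Tools/Parts"   ' hypothetical form value
strOrderId = "A*17?"               ' hypothetical form value
strRaw = strCompany & "_" & strOrderId & ".dat"

' Valid single-character replacement
WScript.Echo SanitizeFilename(strRaw, "-")    ' Acme- Tools-Parts_A-17-.dat

' Replacement is itself an invalid character, so underscore is used
WScript.Echo SanitizeFilename(strRaw, "*")    ' Acme_ Tools_Parts_A_17_.dat

' Replacement is longer than one character, so underscore is used
WScript.Echo SanitizeFilename(strRaw, "--")   ' Acme_ Tools_Parts_A_17_.dat
```

Note that spaces pass through untouched, since a space is a valid filename character under this pattern.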
The regex pattern replaces invalid characters for Windows NTFS (generally), but could easily be altered for other OS file systems. You could also use a more restrictive regular expression like:
[^\w\x20\&%'`\-\@{}~!#\(\)&_\^\+,\.=\[\]]
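As a rough sketch of how that whitelist-style pattern could be dropped into the same approach, the function below replaces every character that is not on the allowed list with an underscore. The name SanitizeFilenameStrict is hypothetical, and the example assumes Windows Script Host (cscript.exe) for the WScript.Echo call.

```vbscript
' Sketch only: SanitizeFilenameStrict is a hypothetical name, not from the original code.
Option Explicit

Function SanitizeFilenameStrict(ByVal strFilename)
    Dim objRegExp
    Set objRegExp = New RegExp

    ' Negated character class: match anything NOT on the allowed list
    objRegExp.Pattern = "[^\w\x20\&%'`\-\@{}~!#\(\)&_\^\+,\.=\[\]]"
    objRegExp.Global = True

    SanitizeFilenameStrict = objRegExp.Replace(strFilename, "_")
    Set objRegExp = Nothing
End Function

' The colon and semicolon are not on the allowed list; spaces, parentheses
' and the dot are, so they are kept.
WScript.Echo SanitizeFilenameStrict("Report: Q3 (final);v2.txt")   ' Report_ Q3 (final)_v2.txt
```

Because this version lists what is allowed rather than what is forbidden, it also strips characters that Windows would accept, such as the semicolon in the sample above.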
Then again, that may be a bit much. Wikipedia has a good filename reference that includes a fairly comprehensive list of reserved words and allowable character sets. As always, if you need a good regex reference, try Regular-Expressions.info.
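Replacing characters does not help if the resulting name is one of the reserved words mentioned above. The sketch below shows one possible check for the commonly documented Windows device names (CON, PRN, AUX, NUL, COM1 through COM9, LPT1 through LPT9), with or without an extension. IsReservedName is a hypothetical helper, the list may not be exhaustive for every Windows version, and the example assumes Windows Script Host for the WScript.Echo calls.

```vbscript
' Sketch only: IsReservedName is a hypothetical helper, not part of the original code.
Option Explicit

Function IsReservedName(ByVal strFilename)
    Dim objRegExp
    Set objRegExp = New RegExp

    ' A device name on its own, or followed by a dot and any extension
    objRegExp.Pattern = "^(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(\..*)?$"
    objRegExp.IgnoreCase = True

    IsReservedName = objRegExp.Test(strFilename)
    Set objRegExp = Nothing
End Function

Dim strName
For Each strName In Array("con.txt", "LPT1", "console.txt")
    If IsReservedName(strName) Then
        WScript.Echo strName & " is a reserved device name"
    Else
        WScript.Echo strName & " is fine"
    End If
Next
```

With these inputs, con.txt and LPT1 are flagged and console.txt is not. A caller could prefix or suffix a flagged name with the replacement character before saving.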
UPDATE: The logic in my if/then statement to check the length and validity of the strReplChar was backwards. The example has been corrected.
