Duplicate File Finder User Manual

Previous: Known Deficiencies    Contents    Next: Java 1.4.x Deficiencies

Part 7:
Future Plans for Duplicate File Finder

I have significant future plans for Duplicate File Finder. These include, in no particular order:

  1. Inclusion of a Bad Messages window, listing problems such as files that Duplicate File Finder was unable to find or read.

  2. Include an option to generate file checksums only if file sizes are identical.
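
The idea behind this option can be sketched in a few lines of Java. This is a hypothetical helper, not code from Duplicate File Finder itself: files with a unique size can never be duplicates, so only files whose sizes collide need the (much more expensive) checksum pass.

```java
import java.util.*;

// Hypothetical sketch: group candidate files by size first, so that
// checksums are only computed for files that share a size.
public class SizeFirstFilter {

    // Given file sizes (path -> bytes), return only the groups of paths
    // whose sizes collide; files with a unique size are filtered out.
    public static Collection<List<String>> sizeCollisions(Map<String, Long> sizes) {
        Map<Long, List<String>> bySize = new HashMap<>();
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            bySize.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        List<List<String>> groups = new ArrayList<>();
        for (List<String> g : bySize.values()) {
            if (g.size() > 1) {
                groups.add(g); // only these candidates need checksums
            }
        }
        return groups;
    }
}
```

In a real run, the sizes map would be filled from the directory scan, and only the returned groups would be checksummed.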

  3. Include an option limiting file searching to files that fall within a user-specified size range (minimum and maximum file sizes).

  4. Exclusion of directories or files from a search.

  5. A filter to examine or exclude only files with certain extensions (admittedly very Windows-like).

  6. Have a switchable default to ignore directories named "cache".

  7. Detection of whether entire directories match each other.

  8. Removal of files from lists without deletion.

  9. Moving found duplicate files to another location.

  10. Replace a duplicated file with a link to another identical, existing file.

  11. Move a duplicated file to the user's trash directory.

  12. Ability to mark files in multiple groups for later single-step processing (delete, move, ...).

  13. Recognition of problem file names.

  14. Recognition of broken links. This is partially implemented already, but you have to run Duplicate File Finder from the command line to get a hint that a link might be broken.

  15. An option to view a file in its normal application before processing it.

  16. With image files, create a list of files, either automatically or a user defined list, and view all those files in the configured external viewer.

  17. Addition of an auto delete/move function based on customisable rules. Such rules could, for example, act on duplicate files within the same directory; specify that when files in certain directories have duplicates in other specified directories, the files in those other directories should be automatically deleted or moved; or always delete the oldest or newest copies.

  18. Be able to specify directories in which duplicate files should not be deleted.

  19. Defining whether the auto delete function should really delete files or just move them to the user's trash directory.

  20. A protection mechanism to ensure you don't accidentally delete all files in a duplicated group, leaving you with no copy of a file that had duplicates.

  21. A protection mechanism to ensure files in system directories (/bin, /etc, ...) are not deleted. This mechanism is primarily for those who insist on running applications as root.

  22. Option to save the current list of found duplicates so processing can continue at a later date.

  23. Option to export the found list of duplicates to a text file (.txt), or a comma delimited file (.csv) for import into a spreadsheet program.
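
A CSV export needs to cope with file names containing commas or quotes. The following is a minimal, hypothetical sketch of the usual CSV quoting convention; the class and method names are illustrative, not part of Duplicate File Finder.

```java
// Hypothetical sketch of CSV field quoting for the export feature:
// wrap a field in quotes if it contains a comma, quote, or newline,
// and double any embedded quotes.
public class CsvExport {

    public static String field(String s) {
        if (s.contains(",") || s.contains("\"") || s.contains("\n")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s; // plain fields need no quoting
    }
}
```

Each row of the export would then simply join the quoted fields with commas.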

  24. Making use of a real database for the storage of file information (DB2?).

  25. If information about files on removable media is to be stored in a database, inclusion of the name of the media in the database, for example the name of the CD the files are stored on.

  26. Recognition of images that are different in their binary make up, but similar to the eye. I have no experience in this area, so any help is welcome.
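
One common approach to this problem is a perceptual "average hash": scale the image down to a small grayscale grid, set one bit per pixel depending on whether it is brighter than the mean, and compare hashes by Hamming distance. The sketch below is purely illustrative (the class name and the assumption that the image has already been reduced to grayscale samples are mine), and real image similarity is considerably more subtle.

```java
// Hypothetical sketch of an "average hash" for near-duplicate images.
// Assumes the image has already been scaled down and converted to
// grayscale samples (0-255), e.g. an 8x8 grid giving 64 values.
public class AverageHash {

    // Each bit of the hash is 1 if that sample is above the mean brightness.
    public static long hash(int[] gray64) {
        long sum = 0;
        for (int g : gray64) sum += g;
        long mean = sum / gray64.length;
        long h = 0;
        for (int i = 0; i < gray64.length; i++) {
            if (gray64[i] > mean) h |= 1L << i;
        }
        return h;
    }

    // Hamming distance between two hashes; a small distance suggests
    // the images are visually similar even if the bytes differ.
    public static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```

Two re-encodings of the same picture would usually hash within a few bits of each other, while unrelated pictures would differ in many bits.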

  27. Recognition of audio files, particularly mp3 and Ogg Vorbis, that are different in their binary make up but similar to the ear. As with images, I have no experience in this area, so any help is welcome.

  28. Possible recognition of files that would be identical except that one is truncated.
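
The core test here is whether the shorter file's contents match a prefix of the longer file's. A minimal sketch, assuming both contents fit in memory as byte arrays (a real implementation would stream and compare block by block):

```java
import java.util.Arrays;

// Hypothetical sketch of truncation detection: one file is a possible
// truncated copy of another if its bytes match a prefix of the other's.
public class TruncationCheck {

    // Order of arguments does not matter; equal-length identical
    // contents also return true (they are ordinary duplicates).
    public static boolean isTruncatedCopy(byte[] a, byte[] b) {
        if (a.length > b.length) {
            return isTruncatedCopy(b, a); // make 'a' the shorter array
        }
        return Arrays.equals(a, Arrays.copyOfRange(b, 0, a.length));
    }
}
```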

  29. Generation of a final report of what was found and what was done. This would include a count of the total number of files and directories examined, the total number of duplicates found, the total number and names of files deleted, moved, or linked, files that could not be deleted or otherwise processed, and the total number and names of files and directories with problem names. All of this will be summarised in a list, and clicking on a list item will expand it to show the information behind it. For example, clicking on the list item with the count of problem file names will expand to show those file names.

  30. Estimation of how long a search will take. This will be based on what types of searches will be done, how much there is to search in terms of megabytes, and the CPU bogomips count. If a search is estimated to take longer than a user-configurable warning time limit (default 30 mins), a window advising of this will be displayed.

  31. An interface redesign. Although functional at the moment, it's not the best. Suggestions are welcome.

  32. An option to save the current settings as either a search profile or as a default for when Duplicate File Finder is started. All saved values would be stored in ~/.dff.

  33. Ability to set up a scheduled job so that Duplicate File Finder starts at a set time.

  34. Future versions will be written using C++. Java was originally chosen because of the ease of writing in it.

  35. The ability to run Duplicate File Finder as a daemon to check files that are downloaded from the net and show a popup indicating that you have a file that matches your criteria and where that file is. But wait a minute - this is going to extremes!!! This feature, if it is ever implemented, will definitely be the last to be implemented.

Other suggestions and requests are welcome.
