Class FileComparer


  • public class FileComparer
    extends Object
    File comparison utility to compare files on the local computer with objects present in a service account and determine whether there are any differences. This utility contains methods to build maps of the contents of the local file system or service account for comparison, and methods to find differences in these maps.

    File comparisons are based primarily on MD5 hashes of the files' contents. If a local file does not match an object in the service with the same name, this utility determines which of the items is newer by comparing the last modified dates.
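
    The comparison rule described above can be sketched with the standard library alone. This is a minimal illustration, not the class's actual implementation: the class name Md5CompareSketch, the Result enum, and the choice to treat equal timestamps as "remote newer" are all assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class Md5CompareSketch {
    enum Result { SAME, LOCAL_NEWER, REMOTE_NEWER }

    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Compares hashes first; only when they differ are the last modified
     * times (milliseconds since the epoch) consulted.
     * Assumption: a tie in timestamps is reported as REMOTE_NEWER.
     */
    static Result compare(String localMd5, long localModified,
                          String remoteMd5, long remoteModified) {
        if (localMd5.equalsIgnoreCase(remoteMd5)) {
            return Result.SAME;
        }
        return localModified > remoteModified ? Result.LOCAL_NEWER : Result.REMOTE_NEWER;
    }
}
```

    Comparing hashes before timestamps means that a file whose modification date changed but whose contents did not is still treated as unchanged.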

    Author:
    James Murty
    • Constructor Detail

      • FileComparer

        public FileComparer​(Jets3tProperties jets3tProperties)
        Constructs a FileComparer instance.
        Parameters:
        jets3tProperties - the object containing the properties that will be applied in this class.
    • Method Detail

      • getInstance

        public static FileComparer getInstance​(Jets3tProperties jets3tProperties)
        Parameters:
        jets3tProperties - the object containing the properties that will be applied in the instance.
        Returns:
        a FileComparer instance.
      • getInstance

        public static FileComparer getInstance()
        Returns:
        a FileComparer instance initialized with the default Jets3tProperties object.
      • buildObjectKeyToFilepathMap

        public Map<String,​String> buildObjectKeyToFilepathMap​(File[] fileList,
                                                                    String fileKeyPrefix,
                                                                    boolean includeDirectories)
        Builds a map of files and directories that exist on the local system, where the map keys are the object key names that will be used for the files in a remote storage service, and the map values are the absolute paths (Strings) to the corresponding files in the local file system. The entire local file hierarchy within the given set of files and directories is traversed (i.e. sub-directories are included).

        A file/directory hierarchy is represented using '/' delimiter characters in object key names.

        Any file or directory matching a path in a .jets3t-ignore file will be ignored.

        Parameters:
        fileList - the set of files and directories to include in the file map.
        fileKeyPrefix - A prefix added to each file path key in the map, e.g. the name of the root directory the files belong to. If provided, a '/' suffix is always added to the end of the prefix. If null or empty, no prefix is used.
        includeDirectories - If true, all directories, including empty ones, will be included in the Map. These directories will be mere place-holder objects with a trailing slash (/) character in the name and the content type Mimetypes.MIMETYPE_BINARY_OCTET_STREAM. If false, directory objects will not be included in the Map, and it will not be possible to store empty directories in the service.
        Returns:
        a Map of object key names to the absolute paths (Strings) of the corresponding local files.
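
        The key-building rules above can be illustrated with a small stand-alone sketch. It is not the method's real implementation: the class name KeyMapSketch is hypothetical, .jets3t-ignore handling is left out, and it assumes a prefix that already ends with '/' is not given a second one.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class KeyMapSketch {
    /** Maps '/'-delimited object key names to absolute local file paths. */
    static Map<String, String> build(File[] files, String prefix, boolean includeDirectories) {
        Map<String, String> map = new TreeMap<String, String>();
        String base = "";
        if (prefix != null && !prefix.isEmpty()) {
            // Assumption: avoid doubling a '/' the caller already supplied.
            base = prefix.endsWith("/") ? prefix : prefix + "/";
        }
        add(files, base, includeDirectories, map);
        return map;
    }

    private static void add(File[] files, String base, boolean includeDirectories,
                            Map<String, String> map) {
        if (files == null) {
            return; // listFiles() returns null for unreadable directories
        }
        for (File f : files) {
            String key = base + f.getName();
            if (f.isDirectory()) {
                if (includeDirectories) {
                    // Directory place-holders carry a trailing slash in the key.
                    map.put(key + "/", f.getAbsolutePath());
                }
                add(f.listFiles(), key + "/", includeDirectories, map);
            } else {
                map.put(key, f.getAbsolutePath());
            }
        }
    }
}
```

        With includeDirectories set to false, an empty directory contributes no entries at all, which is why empty directories cannot be stored in that mode.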
      • listObjectsThreaded

        public StorageObject[] listObjectsThreaded​(StorageService service,
                                                   String bucketName,
                                                   String targetPath,
                                                   String delimiter,
                                                   int toDepth)
                                            throws ServiceException
        Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. This method divides the object namespace using the given delimiter, traverses this space up to the specified depth to identify prefix names for multiple "partitions", and then lists the objects in each partition. It returns the complete list of objects in the bucket path.

        This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.

        Parameters:
        service - the service object that will be used to perform listing requests.
        bucketName - the name of the bucket whose contents will be listed.
        targetPath - a root path within the bucket to be listed. If this parameter is null, all the bucket's objects will be listed. Otherwise, only the objects below the virtual path specified will be listed.
        delimiter - the delimiter string used to identify virtual subdirectory partitions in a bucket. If this parameter is null, or it has a value that is not present in your object names, no partitioning will take place.
        toDepth - the number of delimiter levels this method will traverse to identify subdirectory partitions. If this value is zero, no partitioning will take place.
        Returns:
        the list of objects under the target path in the bucket.
        Throws:
        ServiceException
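
        To make the partitioning idea concrete, the following sketch derives partition prefixes from an in-memory list of key names. It only illustrates how a delimiter and depth carve up a namespace; the real method discovers prefixes through listing requests to the service, and the class name PartitionSketch is hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class PartitionSketch {
    /**
     * Returns the distinct prefixes made of up to toDepth delimiter-terminated
     * segments of each key. Keys containing no delimiter yield no prefix.
     */
    static Set<String> partitionPrefixes(List<String> keys, String delimiter, int toDepth) {
        Set<String> prefixes = new TreeSet<String>();
        if (delimiter == null || delimiter.isEmpty() || toDepth <= 0) {
            return prefixes; // no partitioning takes place
        }
        for (String key : keys) {
            int end = 0;
            int depth = 0;
            while (depth < toDepth) {
                int idx = key.indexOf(delimiter, end);
                if (idx < 0) {
                    break; // key is shallower than the requested depth
                }
                end = idx + delimiter.length();
                depth++;
            }
            if (end > 0) {
                prefixes.add(key.substring(0, end));
            }
        }
        return prefixes;
    }
}
```

        Each resulting prefix could then be listed by its own thread. When one prefix holds most of the objects, that thread dominates the running time, which is why partitions of roughly equal size work best.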
      • listObjectsThreaded

        public StorageObject[] listObjectsThreaded​(StorageService service,
                                                   String bucketName,
                                                   String targetPath)
                                            throws ServiceException
        Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. This method divides the object namespace using the delimiter configured for the bucket, traverses this space up to the configured depth to identify prefix names for multiple "partitions", and then lists the objects in each partition. It returns the complete list of objects in the bucket path.

        This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.

        The delimiter and depth properties that define how this method will partition the bucket's namespace are set in the jets3t.properties file with the setting: filecomparer.bucket-listing.<bucketname>=<delim>,<depth>
        For example: filecomparer.bucket-listing.my-bucket=/,2

        Parameters:
        service - the service object that will be used to perform listing requests.
        bucketName - the name of the bucket whose contents will be listed.
        targetPath - a root path within the bucket to be listed. If this parameter is null, all the bucket's objects will be listed. Otherwise, only the objects below the virtual path specified will be listed.
        Returns:
        the list of objects under the target path in the bucket.
        Throws:
        ServiceException
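
        As an illustration of the property format, the sketch below parses a filecomparer.bucket-listing.<bucketname> value into its delimiter and depth using java.util.Properties. The class names are hypothetical, and falling back to "no partitioning" when the property is missing or has no comma is an assumption of this example, not documented behavior.

```java
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class BucketListingPropertySketch {
    /** Simple holder for the two parsed values. */
    static final class Partitioning {
        final String delimiter;
        final int depth;

        Partitioning(String delimiter, int depth) {
            this.delimiter = delimiter;
            this.depth = depth;
        }
    }

    /** Parses "<delim>,<depth>" for the given bucket name. */
    static Partitioning lookup(Properties props, String bucketName) {
        String value = props.getProperty("filecomparer.bucket-listing." + bucketName);
        if (value == null) {
            return new Partitioning(null, 0); // assumption: unset means no partitioning
        }
        // Split on the last comma so the depth is always the final field.
        int comma = value.lastIndexOf(',');
        if (comma < 0) {
            return new Partitioning(null, 0);
        }
        String delimiter = value.substring(0, comma).trim();
        int depth = Integer.parseInt(value.substring(comma + 1).trim());
        return new Partitioning(delimiter, depth);
    }
}
```

        For the example setting filecomparer.bucket-listing.my-bucket=/,2 this yields the delimiter "/" and a depth of 2. A non-numeric depth makes Integer.parseInt throw NumberFormatException, which this sketch lets propagate.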
      • lookupObjectMetadataForPotentialClashes

        public Map<String,​StorageObject> lookupObjectMetadataForPotentialClashes​(StorageService service,
                                                                                       String bucketName,
                                                                                       String targetPath,
                                                                                       StorageObject[] objectsWithoutMetadata,
                                                                                       Map<String,​String> objectKeyToFilepathMap,
                                                                                       BytesProgressWatcher progressWatcher,
                                                                                       StorageServiceEventListener eventListener)
                                                                                throws ServiceException
        Given a set of storage objects for which only minimal information is available, retrieve metadata information for any objects that potentially clash with local files. An object is considered a potential clash when it has the same object key name as a local file pending upload/download, and when the hash value of the object data contents either differs from the local file's hash or the hash comparison cannot be performed without the metadata information.
        Parameters:
        service -
        bucketName -
        targetPath -
        objectsWithoutMetadata -
        objectKeyToFilepathMap -
        progressWatcher - watcher to monitor bytes read during comparison operations, may be null.
        eventListener -
        Returns:
        mapping of keys to StorageObjects
        Throws:
        ServiceException
        See Also:
        populateObjectMap(String, StorageObject[])
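
        The "potential clash" rule can be expressed as a small predicate over two maps. This sketch is an assumption-laden simplification: it represents objects and files only by key name and MD5 hex string, uses a null remote hash to stand for "hash not available without metadata", and the class name ClashSketch is hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class ClashSketch {
    /**
     * Returns the keys present on both sides whose hashes differ, or whose
     * remote hash is unknown (null) and so cannot be compared yet.
     */
    static Set<String> potentialClashes(Map<String, String> remoteKeyToMd5,
                                        Map<String, String> localKeyToMd5) {
        Set<String> clashes = new TreeSet<String>();
        for (Map.Entry<String, String> remote : remoteKeyToMd5.entrySet()) {
            String key = remote.getKey();
            if (!localKeyToMd5.containsKey(key)) {
                continue; // no local file with this key, so nothing to clash with
            }
            String remoteMd5 = remote.getValue();
            if (remoteMd5 == null || !remoteMd5.equalsIgnoreCase(localKeyToMd5.get(key))) {
                clashes.add(key);
            }
        }
        return clashes;
    }
}
```

        Only the keys in the returned set would need a further metadata lookup, which keeps the number of extra requests proportional to the number of candidate differences rather than to the size of the bucket.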
      • populateObjectMap

        public Map<String,​StorageObject> populateObjectMap​(String targetPath,
                                                                 StorageObject[] objects)
        Builds a map of key/object pairs, where each object is associated with a key based on its location in the service target path.
        Parameters:
        targetPath -
        objects -
        Returns:
        a map of keys to StorageObjects.
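
        One plausible reading of "a key based on its location in the service target path" is that each map key is the object's key name relative to the target path. The sketch below illustrates that reading with plain strings; it is an assumption rather than the documented behavior, and the class name RelativeKeySketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class RelativeKeySketch {
    /** Maps each key, made relative to targetPath, to the full object key. */
    static Map<String, String> relativeKeys(String targetPath, String[] objectKeys) {
        Map<String, String> map = new LinkedHashMap<String, String>();
        String root = (targetPath == null) ? "" : targetPath;
        for (String key : objectKeys) {
            String relative = key;
            if (!root.isEmpty() && key.startsWith(root)) {
                relative = key.substring(root.length());
                if (relative.startsWith("/")) {
                    relative = relative.substring(1); // drop the separating slash
                }
            }
            if (!relative.isEmpty()) {
                // Assumption: the object standing for the target path itself is skipped.
                map.put(relative, key);
            }
        }
        return map;
    }
}
```

        Keys made relative in this way line up with local map keys built without a prefix, so the two maps can be compared entry by entry.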
      • isSkipSymlinks

        public boolean isSkipSymlinks()
        Returns:
        true if the "filecomparer.skip-symlinks" configuration option is set.
      • isUseMd5Files

        public boolean isUseMd5Files()
        Returns:
        true if the "filecomparer.use-md5-files" configuration option is set.
      • isGenerateMd5Files

        public boolean isGenerateMd5Files()
        Returns:
        true if the "filecomparer.generate-md5-files" configuration option is set.
      • isSkipMd5FileUpload

        public boolean isSkipMd5FileUpload()
        Returns:
        true if the "filecomparer.skip-upload-of-md5-files" configuration option is set.
      • isAssumeLocalLatestInMismatch

        public boolean isAssumeLocalLatestInMismatch()
        Returns:
        true if the "filecomparer.assume-local-latest-in-mismatch" configuration option is set.
      • getMd5FilesRootDirectoryFile

        public File getMd5FilesRootDirectoryFile()
                                          throws FileNotFoundException
        Returns:
        the file represented by the configuration option "filecomparer.md5-files-root-dir" or null if this option is not specified.
        Throws:
        FileNotFoundException
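
        The configuration getters above each report a single option. The sketch below shows how such options could be read from a java.util.Properties object; it does not use the Jets3tProperties class, the class name ComparerOptionsSketch is hypothetical, and treating an unset boolean option as false is an assumption. Any check that the MD5 root directory actually exists is omitted.

```java
import java.io.File;
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of JetS3t. */
class ComparerOptionsSketch {
    private final Properties props;

    ComparerOptionsSketch(Properties props) {
        this.props = props;
    }

    /** Returns true only if the named option is present and equals "true". */
    boolean isSet(String optionName) {
        // Assumption: an option that is not specified counts as false.
        return Boolean.parseBoolean(props.getProperty(optionName, "false").trim());
    }

    /** Returns the configured MD5 root directory, or null if none is specified. */
    File md5FilesRootDirectory() {
        String dir = props.getProperty("filecomparer.md5-files-root-dir");
        return (dir == null) ? null : new File(dir);
    }
}
```

        A caller might, for example, test isSet("filecomparer.skip-symlinks") before deciding whether to follow symbolic links while building the local file map.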