Middleware is a software architecture concept that refers to integration of disparate applications to facilitate reliable communication. Middleware frequently relies on encapsulating inter-application communications in the concept of a “message”, and often has the ability to queue or perform optimized delivery or copying of messages to various applications.
Common types of middleware include EAI (“Enterprise Application Integration”) middleware such as ESB (“Enterprise Service Bus”) or the older MOM (“Message-Oriented Middleware”).
File transfer applications are themselves often used as middleware, helping to facilitate bulk data transfers between applications using standards such as FTP. Managed file transfers often include the ability to perform some intelligent routing of data and sensitivity to particular transmission windows set by the business.
ESB is short for “Enterprise Service Bus”, a modern integration technology used to quickly tie together heterogeneous applications across different operating systems, platforms and deployment models.
Enterprise Application Integration (“EAI”) is a methodology which balances a seamless experience across heterogeneous enterprise applications and datasets of various origins, scope and capability against the need to make major changes to those applications or datasets.
Today, EAI often uses ESB (“Enterprise Service Bus”) infrastructure to allow these various applications to communicate with each other. Before ESB, MOM (“Message-Oriented Middleware”) would have been used instead.
Today’s convergence of file transfer and EAI systems was foretold by Steve Cragg’s 2003 white paper entitled “File Transfer for the Future – Using modern file transfer solutions as part of an EAI strategy”. In that paper, Cragg wrote that, “judicious use of file transfer in its modern form as part of an overall EAI strategy can reduce overall business risk, deliver an attractive level of ROI, speed time to market for new services and enable new business opportunities quickly (for example B2B).”
An Enterprise Service Bus (“ESB”) is a modern integration concept that refers to architectural patterns or specific technologies designed to rapidly interconnect heterogeneous applications across different operating systems, platforms and deployment models.
ESBs include a set of capabilities that speed and standardize a Service-Oriented Architecture (“SOA”), including service creation and mediation, routing, data transformation, and management of messages between endpoints.
With the rise of SOA in the mid-2000s, ESBs took over from MOM (“Message-Oriented Middleware”) as the leading technology behind EAI (“Enterprise Application Integration”).
Examples of commonly deployed ESBs include MuleSoft’s open source Mule ESB, IBM WebSphere ESB, Red Hat JBoss and Oracle ESB. Implementations of the Java Business Integration (“JBI”) specification, such as Apache ServiceMix, are also often referred to as ESBs.
EAI is short for “Enterprise Application Integration”, a methodology which balances a seamless experience across heterogeneous enterprise applications and datasets of various origins, scope and capability against the need to make major changes to those applications or datasets.
Message-Oriented Middleware (“MOM”) is software that delivers robust messaging capabilities across heterogeneous operating systems and application environments. Up through the early 2000s, MOM was the backbone of most EAI (“Enterprise Application Integration”) inter-application connectivity. Today, that role largely belongs to ESB (“Enterprise Service Bus”) infrastructure instead.
In the context of file transfer, MOM stands for “Message-Oriented Middleware”, which is software that delivers robust messaging capabilities across heterogeneous operating systems and application environments.
The term “FTP with PGP” describes a workflow that combines the strong end-to-end encryption, integrity and signing of PGP with the FTP transfer protocol. While FTPS can and often should be used to protect your FTP credentials, the underlying protocol in FTP with PGP workflows is often just plain old FTP.
BEST PRACTICE: (If you like FTP with PGP.) FTP with PGP is fine as long as care is taken to protect the FTP username and password credentials while they are in transit. The easiest, most reliable and most interoperable way to protect FTP credentials is to use FTPS instead of non-secure FTP.
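As a rough illustration of that advice, the sketch below uploads a file that has already been PGP-encrypted, using FTPS from Python's standard ftplib module. The function name upload_encrypted and its client_factory parameter are assumptions made for this example (the parameter exists only so the sketch can be exercised without a live server); the PGP encryption step itself is assumed to have happened beforehand with whatever PGP tool is in use.

```python
import io
from ftplib import FTP_TLS


def upload_encrypted(host, user, password, remote_name, pgp_bytes,
                     client_factory=FTP_TLS):
    """Upload already PGP-encrypted bytes over FTPS (explicit TLS).

    `client_factory` is a hypothetical hook for testing; by default the
    standard library's FTP_TLS client is used.
    """
    client = client_factory(host)        # FTP_TLS connects when given a host
    try:
        # FTP_TLS.login() secures the control channel before it sends the
        # username and password, so the credentials are not sent in the clear.
        client.login(user, password)
        client.prot_p()                  # also protect the data channel
        client.storbinary(f"STOR {remote_name}", io.BytesIO(pgp_bytes))
    finally:
        client.quit()
```

With plain FTP in place of FTPS, the payload would still be protected by PGP, but the login step would expose the credentials on the wire.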
BEST PRACTICE: (If you want an alternative to FTP with PGP.) The AS1, AS2 and AS3 protocols all provide the same benefits as FTP with PGP, plus the benefit of a signed receipt to provide non-repudiation. Several vendors also have their own way to provide the same benefits as FTP with PGP without onerous key exchange, without a separate encrypt-in-transit step or with streaming encryption; ask your file transfer vendors what they offer as an alternative to FTP with PGP.
The term “package” can mean different things in different file transfer situations.
“Installation package” – A file that contains all the executables, installation scripts and other data needed to install a particular application. This file is usually a compressed file and is often a self-extracting compressed file.
“Package sent to another person” – Very similar in scope to email’s “message with attachments”. This is a term that has rapidly gained acceptance (since about 2008) to describe what gets sent in “person-to-person” transmission. A package may contain zero or more files and a plain or richly formatted text message as its payload. A package will also always contain system-specific metadata such as sender/receiver identity, access control attributes, time-to-live information and quality of service information.
“Installation package” is the earlier context of the term “package”; if you’re dealing with server teams or transmission specialists who deal primarily with system-to-system transfers then “installation package” is probably what they mean when they say “package”.
“Package sent to another person” has evolved as file transfer vendors gradually align the terminology of person-to-person file transfers with physical parcel transfers like those done by UPS or FedEx. In physical parcel transfers, individual packages may contain a variety of things but each is specifically addressed, guaranteed to be delivered safely and intact and each has its own level of service (e.g., 2nd day vs. overnight). The term “packages” is similarly used with many person-to-person file transfer solutions to help non-technical people understand the concept in a different context.
Quality of Service (or “QOS”) is the ability to describe a particular level of service and then intelligently allocate resources to reliably provide that level of service. A common example of general QOS capabilities is found in the “traffic shaping” features of routers: different types of traffic (e.g., web surfing, videoconferencing, voice, etc.) share a common network but allocations are intelligently made to ensure the proper prioritization of traffic critical to the business.
In a business context, QOS is closely associated with a partner’s ability to meet its Service Level Agreements (SLAs) or an internal department’s ability to meet its Operating Level Agreements (OLAs).
In a technical context, file transfer QOS typically involves one or more of the following factors:
- Traffic shaping – ensuring that FTP, SSH and other file transfer traffic continues to operate in flooded network conditions. (Many types of file transfer traffic, including FTP and SSH traffic, are often easy to spot.)
- Network timeouts and negotiation responses – ensuring that long-running sessions are explicitly allowed or denied, or throttling TCP negotiations (either speeding up to ensure the initial attempt survives or scaling back to limit the effects of rapid-fire script kiddie attacks).
- Any of the major components of a file transfer SLA – e.g., availability of file transfer services, round-trip response time for file submissions or completion of particular sets of work.
QOS stands for “Quality Of Service”. See “Quality of Service” for more information.
Self-provisioning is the ability for individual end users and partners to set up (or “provision”) their own accounts.
Self-provisioning is a common element of most cloud services but remains relatively rare in file transfer applications. A major difference between those environments is that self-provisioning in cloud services usually involves linking a credit card or other form of payment to each provisioned account. This gives cloud services two important things that encourage the use of self-provisioning: a third-party validation of a user’s identity and an open account to bill if things go astray. File transfer environments also involve a lot of trusted links and require, either by law or regulation, human intervention before such a link is authorized.
BEST PRACTICE: Self-provisioning may or may not be right for your environment. As is the case with many types of automation, evaluation of this technology in a file transfer environment should involve a cost-benefit analysis of manually provisioning and maintaining groups of users vs. building a self-provisioning application that meets your organization’s standards for establishing identity and access. A common alternative that lies between manual provisioning and self-provisioning is the ability to delegate permission to provision a partner’s users to that partner’s administrator. (File transfer Community Management often involves delegating provisioning privileges this way.)
To onboard a user or onboard a partner is to set up all the necessary user accounts, permissions, workflow definitions and other elements necessary to engage in electronic transfers of information with those users and partners.
Automatic onboarding of users or partners usually involves external authentication technology of some kind. When that technology involves particularly rich user or partner profiles and allows users and partners to maintain their own information, then the external authentication technology used to onboard users and partners is often called “Community Management” technology.
“On board” and “on-board” are also occasionally used instead of “onboard”, and administrators often use the phrases “onboard a user” and “provision a user” interchangeably. See “provisioning” for more information.
SLA is an abbreviation for “Service Level Agreement”, which is a specific contract between a customer and a provider that lays out exactly what each side can expect from the other. The minimum amount of work and minimum level of due care that a file transfer operations team is responsible for is often determined by the SLAs it must meet.
See “Service Level Agreement” for more information.
External authentication is the use of third-party authentication sources to decide whether a user should be allowed access to a system, and often what level of access an authenticated user enjoys on a system.
In file transfer, external authentication frequently refers to the use of Active Directory (AD), LDAP or RADIUS servers, and can also refer to the use of various single sign-on (SSO) technologies.
External authentication sources typically provide username information and password authentication. Other types of authentication available include client certificates (particularly with AD or LDAP servers), PINs from hardware tokens (common with RADIUS servers) or soft/browser tokens (common with SSO technology).
External authentication sources often provide file transfer servers with the full name, email address and other contact information related to an authenticating user. They can also provide group membership, home folder, address book and access privileges. When external authentication technology involves particularly rich user or partner profiles and allows users and partners to maintain their own information, then the external authentication technology used to onboard users and partners is often called “Community Management” technology.
See also “provisioning” and “deprovisioning”.
A transmission window is a window of time in which certain file transfers are expected or allowed to occur.
Transmission windows typically reoccur on a regular basis, such as every day, on all weekdays, on a particular day of the week, or on the first or last day of the month or quarter.
Most transmission windows are contiguous and set on hourly boundaries (e.g., from 3:00pm to 11:00pm) but can also contain breaks (e.g., 3-6pm and 7-9pm) and start/end on minute/second boundaries (e.g., from 3:05:30pm to 7:54:29pm).
Files received outside of transmission windows are not immediately processed or forwarded by file transfer systems. Instead, they are typically stored or queued, and are usually processed when the transmission window next opens. For example, a file received at 7:58am for an 8am-2pm transmission window would be processed today at 8:00am, but a file received at 2:02pm would be processed tomorrow at 8:00am.
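The queuing behavior described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a single contiguous window that recurs every day, and the function name next_processing_time is made up for this example rather than taken from any product.

```python
from datetime import datetime, time, timedelta


def next_processing_time(received, window_start, window_end):
    """Return when a file received at `received` would be processed.

    Assumes one daily window [window_start, window_end) that recurs every
    day; real schedulers also handle weekdays, breaks and month-end rules.
    """
    opens_today = datetime.combine(received.date(), window_start)
    closes_today = datetime.combine(received.date(), window_end)
    if received < opens_today:
        return opens_today                    # queued until the window opens
    if received < closes_today:
        return received                       # inside the window: process now
    return opens_today + timedelta(days=1)    # queued until tomorrow's window


start, end = time(8, 0), time(14, 0)          # the 8am-2pm window from the text
early = next_processing_time(datetime(2024, 3, 4, 7, 58), start, end)
late = next_processing_time(datetime(2024, 3, 4, 14, 2), start, end)
```

Here the 7:58am file is held until 8:00am the same day, while the 2:02pm file is held until 8:00am the following day.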
When transmission windows are coupled with specific types of file transfers, service level agreements (SLAs) can be written to lock down expectations and reduce variability.
See also: cut-off time.
BEST PRACTICE: When possible, select file transfer scheduling technology that allows you to maintain transmission windows separate from your defined workflows. For example, if you want to change the transmission window for a particular type of file across 50 customer workflows, make sure you can do so by only changing one window definition, not 50 workflow definitions. Also look for technology that allows you to see and control the contents of queued files received outside of transmission windows. Your operators may want to allow, roll over or simply delete some of the files received outside any particular transmission window.
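One way to picture the "one window definition, not 50 workflow definitions" advice is a configuration in which workflows refer to windows by name. The sketch below uses plain Python dictionaries; the window name, workflow names and the window_for helper are all hypothetical.

```python
# Window definitions live in one place, keyed by a hypothetical name.
windows = {"nightly-batch": ("20:00", "23:00")}

# Fifty hypothetical customer workflows refer to the window by name only.
workflows = {
    f"customer-{n:02d}-payroll": {"window": "nightly-batch"}
    for n in range(1, 51)
}


def window_for(workflow_name):
    """Look up the current window for a workflow at the time of use."""
    return windows[workflows[workflow_name]["window"]]


# One edit to the window definition is seen by all fifty workflows.
windows["nightly-batch"] = ("21:00", "23:30")
```

If each workflow stored its own copy of the start and end times instead, the same change would require fifty separate edits.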
A file transfer service level agreement (SLA) establishes exactly what a particular customer should expect from a particular file transfer provider, and how that customer should seek relief for grievances.
A file transfer SLA will often contain the following kinds of service expectations:
Availability: This expresses how often the file transfer service is expected to be online. An availability SLA is often expressed as a percentage with a window of downtime. For example: “99.9% uptime except for scheduled downtime between 2:00am and 5:00am on the second Sunday of the month.”
Different availability SLAs may be in effect for different services or different customers. Availability SLAs are not unique to file transfer; most Internet-based services contain an availability SLA of some kind.
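To make an availability percentage concrete, it helps to convert it into a downtime budget. The sketch below is a simple arithmetic illustration; it assumes that scheduled maintenance is excluded from the measured period, which is one common reading of such clauses but should be confirmed against the actual SLA wording. The function name is made up for this example.

```python
def allowed_downtime_minutes(uptime_percent, period_hours, scheduled_hours=0.0):
    """Unscheduled downtime permitted by an availability SLA, in minutes.

    Assumption: scheduled maintenance hours do not count toward the
    measured period.
    """
    measured_minutes = (period_hours - scheduled_hours) * 60
    return measured_minutes * (100.0 - uptime_percent) / 100.0


# A 30-day month at 99.9% with one 3-hour maintenance window.
budget = allowed_downtime_minutes(99.9, period_hours=30 * 24, scheduled_hours=3)
```

Under these assumptions the budget works out to roughly 43 minutes of unscheduled downtime for the month.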
Round-Trip Response Time: This expresses how fast a complete response to a submitted file will be returned. A round-trip response time SLA is often expressed as a certain length of time. For example, “we promise to return a complete response for all files submitted within 20 minutes of a completed upload”. Sometimes a statistical percentage is also included, as in “on average, 90% of all files will receive a completed response within 5 minutes.”
The reference to “round-trip” response time rather than just “response time” indicates that the time counted against the SLA is the total time it takes for a customer to upload a file, for that file to be consumed and processed internally, and for any response files to be written and made available to customers. Simple “response time” could just indicate the amount of time it would take the system to acknowledge (but not process) the original upload.
Different round-trip response time SLAs may be in place for different customers, types of files or times of day. Round-trip response time SLAs are similar to agreements found in physical logistics: “if you place an order by this time the shipment will arrive by that time.”
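Both styles of round-trip promise (a hard limit and a statistical percentage) can be checked against measured response times. The following sketch uses invented sample data and a made-up helper name, fraction_within; the 5-minute and 20-minute thresholds mirror the examples quoted above.

```python
def fraction_within(response_seconds, limit_seconds):
    """Fraction of round-trip times at or under the limit (1.0 if no data)."""
    if not response_seconds:
        return 1.0
    on_time = sum(1 for seconds in response_seconds if seconds <= limit_seconds)
    return on_time / len(response_seconds)


# Hypothetical round-trip times, in seconds, for ten submitted files.
samples = [45, 120, 290, 310, 75, 200, 60, 1500, 180, 240]

meets_90_percent_in_5_min = fraction_within(samples, 5 * 60) >= 0.90
meets_all_in_20_min = all(seconds <= 20 * 60 for seconds in samples)
```

In this sample, two files exceed five minutes and one exceeds twenty, so both promises would be missed.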
Completed Body of Work: This expresses that a particular set of expected files will arrive in a particular window and will be appropriately handled, perhaps yielding a second set of files, within the same or extended window. For example, “we expect 3 data files and 1 control file between 4pm and 8pm every day, and we expect 2 response files back at any time in that window but no later than 9pm.”
Files in a body of work can be specified by name, path, size, contents or other pieces of metadata. There are typically two windows of time (“transmission windows”) associated with a body of work: the original submission window and a slightly larger window for responses.
SLAs can be set up between independent partners or between departments or divisions within an organization. A less stringent form of SLA is known as an operating level agreement (OLA) when it is between two departments in the same organization, especially when the OLA is set up to help support a customer-facing SLA.
BEST PRACTICE: A good file transfer SLA will contain expectations around availability and either round-trip response times or expected work to be performed in a certain transfer window, as well as specific penalties for failing to meet expectations. File transfer vendors should provide adequate tools to monitor SLAs and allow people who use their solutions to detect SLA problems in advance and to compensate customers appropriately if SLAs are missed.
In file transfer, a “translation engine” is a common name for a “transformation engine” that converts documents from one document definition to another through “transformation maps”.
See “transformation engine” for more information.
In file transfer, a “mapper” is a common name for a “transformation engine” that converts documents from one document definition to another through “transformation maps”.
See “transformation engine” for more information.
A transformation engine is software that performs the work defined in individual transformation maps.
The transformation engines that power transformation maps are typically defined as “single-pass” or “multiple-pass” engines. Single-pass engines are faster than multiple-pass engines because documents are directly translated from source formats to destination formats, but single-pass engines often require more manual setup and are harder to integrate and extend than multiple-pass engines. Multiple-pass engines use an intermediate format (usually XML) between the source and destination formats; this makes them slower than single-pass engines but often eases interoperability between systems.
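The difference between the two designs can be sketched with a toy record. In the example below, both the comma-separated source format and the pipe-delimited destination format are invented for illustration, and the function names are hypothetical; the multiple-pass version uses a small XML document as its intermediate format, built with Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def csv_to_pipe_single_pass(line):
    """Single pass: translate the source format directly to the destination."""
    name, amount = line.split(",")
    return f"{name.strip().upper()}|{float(amount):.2f}"


def csv_to_xml(line):
    """First pass: source format to an intermediate XML record."""
    name, amount = line.split(",")
    record = ET.Element("payment")
    ET.SubElement(record, "name").text = name.strip()
    ET.SubElement(record, "amount").text = amount.strip()
    return ET.tostring(record, encoding="unicode")


def xml_to_pipe(xml_text):
    """Second pass: intermediate XML record to the destination format."""
    record = ET.fromstring(xml_text)
    amount = float(record.findtext("amount"))
    return f"{record.findtext('name').upper()}|{amount:.2f}"


def csv_to_pipe_multi_pass(line):
    return xml_to_pipe(csv_to_xml(line))


direct = csv_to_pipe_single_pass("Acme, 12.5")
via_xml = csv_to_pipe_multi_pass("Acme, 12.5")
```

Both paths produce the same output, but the multiple-pass path does extra serialization and parsing work. In exchange, adding a new destination format only requires one new function that reads the intermediate XML.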
BEST PRACTICE: Your decision to use a single- or multiple-pass map transformation engine should be predicated first on performance, then on interoperability. (It won’t matter how interoperable your deployment is if it can’t keep up with your traffic.) However, the ever-increasing speed of computers and more common use of parallel, coordinated systems is gradually tilting the file transfer industry in favor of multiple-pass transformation engines.
In file transfer, a “document definition” typically refers to a very specific, field-by-field description of a single document format (such as an ACH file) or single set of transaction data (such as EDI’s “997” Functional Acknowledgment).
Document definitions are used in transformation maps and can often be used outside of maps to validate the format of individual documents.
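As a rough sketch of using a definition for validation, the example below describes a made-up fixed-width record field by field and checks individual lines against it. The record layout is purely hypothetical (it is not the ACH or EDI 997 format), as are the names ACK_DEFINITION and validate.

```python
# Hypothetical fixed-width acknowledgment record: (field name, width, check).
ACK_DEFINITION = [
    ("record_type", 2, str.isalpha),
    ("batch_id", 6, str.isdigit),
    ("status", 1, lambda value: value in {"A", "R"}),
]


def validate(line, definition):
    """Return a list of problems; an empty list means the line conforms."""
    expected_length = sum(width for _, width, _ in definition)
    if len(line) != expected_length:
        return [f"expected {expected_length} characters, got {len(line)}"]
    problems = []
    position = 0
    for name, width, check in definition:
        value = line[position:position + width]
        if not check(value):
            problems.append(f"bad {name}: {value!r}")
        position += width
    return problems


good = validate("AK000123A", ACK_DEFINITION)
bad = validate("AK00012XA", ACK_DEFINITION)
```

The first line conforms to the definition; the second is rejected because its batch_id field contains a non-digit character.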
The best known example of a document definition language today is XML’s DTD (“Document Type Definition”).
Many transformation engines understand XML DTDs and some use standard transformation mechanisms like XSLT (“Extensible Stylesheet Language Transformations”). However, most transformation engines depend on proprietary mapping formats (particularly for custom maps) that prevent much interoperability from one vendor to another.
A transformation map (or just “map”) provides a standardized way to transform one document format into another through the use of pre-defined document definitions.
A single transformation map typically encompasses separate source and destination document definitions, a field-by-field “mapping” from the source document to the destination, and metadata such as the name of the map, what collection it belongs to and which people and workflows can access it.
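The parts of a map listed above can be pictured as a small data structure. This is only a sketch: the class name, its attributes and the sample invoice fields are assumptions for illustration, and access-control metadata is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class TransformationMap:
    name: str                    # metadata: the name of the map
    collection: str              # metadata: the collection it belongs to
    source_fields: tuple         # stand-in for the source document definition
    destination_fields: tuple    # stand-in for the destination definition
    field_mapping: dict          # destination field -> source field

    def apply(self, source_record):
        """Produce a destination record from a source record."""
        missing = [f for f in self.source_fields if f not in source_record]
        if missing:
            raise ValueError(f"source record is missing fields: {missing}")
        return {dest: source_record[src]
                for dest, src in self.field_mapping.items()}


invoice_map = TransformationMap(
    name="partner-invoice-to-internal",
    collection="invoices",
    source_fields=("InvNo", "Amt"),
    destination_fields=("invoice_number", "amount"),
    field_mapping={"invoice_number": "InvNo", "amount": "Amt"},
)
result = invoice_map.apply({"InvNo": "1001", "Amt": "12.50"})
```

A transformation engine would then be the software that loads maps like this one and runs apply against incoming documents.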
It is common to “develop” new maps and document formats to cope with document formats unique to a specific organization, trading arrangement or industry. (The term “development” is still typically used with maps in the file transfer industry, even though most mapping interfaces are now 99%+ drag-and-drop.)
BEST PRACTICE: Most transformation engines (especially those tuned for a particular industry) now come with extensive pre-defined libraries of common document formats and maps to translate between them. Before investing in custom map development, research available off-the-shelf options thoroughly.
In file transfer, a “map” is usually short for “transformation map”, which provides a standardized way to transform one document format into another through the use of pre-defined document definitions.
See “transformation map” for more information.
Provisioning is the act of adding access to and allocating resources to end users and their file transfer workflows. It is often used interchangeably with the term “onboarding”.
The act of provisioning should always be audited, and the audit information should include the identity of the person who authorized the act and any technical actions the system took to provision the user.
Most file transfer servers today allow administrators to chain up to Active Directory (AD), LDAP or RADIUS or other external authentication to allow centralized management (and thus provisioning) of authentication and access. However, provisioning of customer-specific workflows is often a manual procedure unless standard workflows are associated with provisioning groups.
Automated provisioning of users through import capabilities, APIs and/or web services is a competitive differentiator across different file transfer servers, and varies widely from “just establish credentials”, through “also configure access” and on to “also configure workflows”.
Use of external authentication usually makes migration from one file transfer technology to another much easier than when proprietary credential databases are in use. When external authentication is in use, end users usually do not need to reset their current passwords. However, when proprietary credential databases from two different vendors (or sometimes two different products from the same vendor) are involved, it is common that every end user will have to change his or her password during migration.
BEST PRACTICE: Whenever possible, implementers of file transfer technology should use an external authentication source to control access and privileges of end users. When an external authentication source is used to control authentication in this manner, provisioning on the file transfer server can occur at any moment after the user is created or enabled on the central authentication server.
See also “deprovisioning” and “onboarding”.
Deprovisioning is the act of removing access from and freeing up resources reserved by end users and their file transfer workflows. Rapid removal of access upon termination or end of contract is key to any organization. Freeing up of related resources (such as disk space, certificates, ports, etc.) is also important, but often follows removal of access by a day or more (especially when overnight processes are used to free up resources).
The act of deprovisioning should always be audited, and the audit information should include the identity of the person who authorized the act and any technical actions the system took to deprovision the user.
Most file transfer servers today allow administrators to chain up to Active Directory (AD), LDAP or RADIUS or other external authentication to allow centralized management (and thus deprovisioning) of authentication and access.
“Rollback” of deprovisioned users is a competitive differentiator across different file transfer servers, and varies widely from “just restore credentials”, through “also restore access” and on to “also restore files and workflows”.
BEST PRACTICE: Whenever possible, implementers of file transfer technology should use an external authentication source to control access and privileges of end users. When an external authentication source is used to control authentication in this manner, deprovisioning on the file transfer server occurs at the moment the user is disabled or deleted on the central authentication server.
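The reason deprovisioning takes effect immediately in this arrangement is that the file transfer server holds no credentials of its own. The toy model below illustrates the idea with an in-memory stand-in for the central authentication server; both class names are hypothetical, and a real deployment would query AD, LDAP or RADIUS instead.

```python
class CentralDirectory:
    """Stand-in for an AD, LDAP or RADIUS server (hypothetical, in-memory)."""

    def __init__(self):
        self._enabled = set()

    def enable(self, username):
        self._enabled.add(username)

    def disable(self, username):
        self._enabled.discard(username)

    def is_enabled(self, username):
        return username in self._enabled


class FileTransferServer:
    """Keeps no credentials of its own; asks the directory on every login."""

    def __init__(self, directory):
        self._directory = directory

    def may_log_in(self, username):
        return self._directory.is_enabled(username)


directory = CentralDirectory()
server = FileTransferServer(directory)
directory.enable("partner-a")
before = server.may_log_in("partner-a")
directory.disable("partner-a")
after = server.may_log_in("partner-a")
```

Because the server consults the directory at login time, disabling the user centrally removes file transfer access with no separate step on the file transfer server.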
See also “provisioning”.
A “clear text password” is a common problem in file transfer security. It is a dangerous problem because it exposes credentials that allow unauthorized individuals to act with the identity and permission of trusted individuals and systems.
The problem happens in at least five different areas:
Clear text password during input: This problem occurs when end users type passwords and those passwords remain visible on the screen after being typed. This exposes passwords to “shoulder surfing” and others who may share a desktop or device. Most applications today show asterisks while a password is being typed. Modern implementations (such as the standard iPhone password interface) show the last character typed for a few seconds, then replace it with an asterisk.
Clear text password during management: This problem occurs when an operator pulls up a connection profile and can read the password off the profile when he/she really only should be using an existing profile. To avoid this problem application developers need to code a permissions structure into the management that permits use without exposing passwords. Application developers also need to be careful that passwords are not accidentally exposed in the interface, even under a “masked” display. (Perform a Google search on “behind asterisks” for more information on this.)
Clear text password during storage: This problem happens when configuration files, customer profiles or FTP scripts are written to disk and no encryption is used to protect the stored data. Application developers can protect configuration files and customer profiles, but when FTP scripts are used, alternate authentication such as client keys is often used instead of passwords.
Clear text password in trace logs: This problem occurs when passwords are written into trace logs. To avoid this problem application developers often need to special-case code that would normally dump clear text passwords to write descriptive labels like “*****” or “7-character password, starting with X” instead.
Clear text password on the wire: This problem occurs when passwords are sent across a network. To avoid this problem secure transport protocols such as SSL/TLS or SSH are often used. The most frequent cause of this problem is not application defects but operator error: when an administrator accidentally configures a client to connect without using a secure protocol, credentials are often sent in the clear.
BEST PRACTICE: All modern file transfer clients and file transfer servers should steer clear of these problems; these are entry-level security concerns and several application security design guidelines (e.g., “Microsoft’s Design Guidelines for Secure Web Applications” and the SANS Institute’s “Security Checklist for Web Application Design”) have covered this for years.
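As an illustration of the trace log case described above, the sketch below masks password values before a log record is written, using Python's standard logging and re modules. The regular expression only covers simple "password=value" and "pwd: value" forms, and the names mask_passwords and PasswordMaskingFilter are made up for this example; a production filter would need to cover every place its own application emits credentials.

```python
import logging
import re

# Matches forms such as "password=secret" or "pwd: secret" (illustrative only).
_PASSWORD_PATTERN = re.compile(r"\b(password|pwd)(\s*[=:]\s*)(\S+)", re.IGNORECASE)


def mask_passwords(text):
    """Replace password values with a fixed label, keeping the key visible."""
    return _PASSWORD_PATTERN.sub(
        lambda match: f"{match.group(1)}{match.group(2)}*****", text
    )


class PasswordMaskingFilter(logging.Filter):
    """Logging filter that masks passwords in the fully formatted message."""

    def filter(self, record):
        record.msg = mask_passwords(record.getMessage())
        record.args = ()          # the message is already formatted
        return True


masked = mask_passwords("login user=alice password=hunter2 ok")
```

The filter can be attached to a handler with handler.addFilter(PasswordMaskingFilter()) so that every record passing through that handler is masked.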
A file transfer protocol that is “firewall friendly” typically has most or all of the following attributes:
1) Uses a single port
2) Connects in to a server from the Internet
3) Uses TCP (so session-aware firewalls can inspect it)
4) Can be terminated or proxied by widely available proxy servers
Active-mode FTP is NOT firewall friendly because it violates #1 and #2.
Most WAN acceleration protocols are NOT firewall friendly because they violate #3 (most use UDP) and #4.
SSH’s SFTP is QUITE firewall friendly because it conforms to #1, 2 and 3.
HTTP/S is probably the MOST firewall friendly protocol because it conforms to #1, 2, 3 and 4.
As these examples suggest, the attribute file transfer protocols most often give up to enjoy firewall friendliness is transfer speed.
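The four attributes and the protocol assessments above can be tabulated in a short sketch. The violation sets come from the statements above, with one labeled assumption: the text only says which attributes SFTP conforms to, so treating #4 as unmet for SFTP is an inference. The names CRITERIA, violations and friendliness are made up for this example.

```python
CRITERIA = {
    1: "uses a single port",
    2: "connects in to a server from the Internet",
    3: "uses TCP",
    4: "can be terminated or proxied by widely available proxy servers",
}

# Attributes each protocol fails to meet, per the assessments in the text.
violations = {
    "Active-mode FTP": {1, 2},
    "Typical WAN acceleration protocol": {3, 4},
    "SFTP": {4},      # assumption: the text only lists what SFTP conforms to
    "HTTP/S": set(),
}


def friendliness(protocol):
    """Number of firewall-friendly attributes the protocol satisfies."""
    return len(CRITERIA) - len(violations[protocol])


ranking = sorted(violations, key=friendliness, reverse=True)
```

The resulting ranking places HTTP/S first and SFTP second, matching the ordering given above.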
When proprietary file transfer “gateways” are deployed in a DMZ network segment for use with specific internal file transfer servers, the “firewall friendliness” of the proprietary protocol used to link gateway and internal server consists of the following attributes instead:
1) Internal server MUST connect to DMZ-resident server (connections directly from the DMZ segment to the internal segment are NOT firewall friendly)
2) SHOULD use a single port (less important than #1)
3) SHOULD use TCP (less important than #2)
Non-repudiation (also “nonrepudiation”) is the ability to prove beyond a shadow of doubt that a specific file, message or transaction was sent at a particular time by a particular party to another party. This proof prevents anyone from “repudiating” the activity: later claiming that the file, message or transaction was not sent, that it was sent at a different time, sent by a different party or received by a different party. (“Repudiate” essentially means “reject”.)
Non-repudiation is important for legal situations where fraud through fake transactions could occur, such as a string of bad ATM transactions. However, it is also an important assumption behind most day-to-day processing: once a request occurs and is processed by an internal system, it’s often difficult and expensive to reverse.
The technology behind non-repudiation is often built on:
- Strong authentication, such as that performed with X.509 certificates, cryptographic keys or tokens.
- Cryptographic-quality hashes, such as SHA-256, that ensure each file’s contents bear their own unique fingerprint. (The fingerprints are stored, even if the data isn’t.)
- Tamper-evident logs that retain date, access and other information about each file sent through the system.
Some file transfer protocols, notably the AS1, AS2 and AS3 protocols (when MDNs are in use), have non-repudiation capabilities built into the protocols themselves. Other protocols depend on proprietary protocol extensions (common in FTP/S and HTTP/S) or higher-level workflows (e.g., an exchange of PGP-encrypted metadata) to accomplish non-repudiation.
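Two of the building blocks listed above, file fingerprints and tamper-evident logs, can be sketched with Python's standard hashlib module. In this illustration each log entry stores the SHA-256 fingerprint of the file plus the hash of the previous entry, so altering any earlier entry breaks the chain. The function names and log layout are assumptions, and the sketch deliberately omits strong authentication and digital signatures, so on its own it does not provide full non-repudiation.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first entry


def file_fingerprint(data):
    """SHA-256 fingerprint of the file contents (the data need not be kept)."""
    return hashlib.sha256(data).hexdigest()


def _entry_hash(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, sender, receiver, sent_at, data):
    """Append a log entry chained to the hash of the previous entry."""
    body = {
        "sender": sender,
        "receiver": receiver,
        "sent_at": sent_at,
        "sha256": file_fingerprint(data),
        "previous_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    log.append({**body, "entry_hash": _entry_hash(body)})


def verify(log):
    """Return True only if every entry is intact and correctly chained."""
    previous = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["previous_hash"] != previous or entry["entry_hash"] != _entry_hash(body):
            return False
        previous = entry["entry_hash"]
    return True


log = []
append_entry(log, "partner-a", "bank", "2024-03-04T16:05:00Z", b"payment batch 1")
append_entry(log, "partner-a", "bank", "2024-03-04T16:09:00Z", b"payment batch 2")
intact = verify(log)
```

Changing any field of an earlier entry, such as its timestamp, causes verify to fail, which is what makes the log tamper-evident rather than tamper-proof.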