Tuesday, December 3, 2013

QT015: Use Filter Recurring Nodes to Apply XPath Predicates within Map Activities

Quick Tip #015: Use Filter Recurring Nodes to Apply XPath Predicates within Map Activities

The Cast Iron Map Activity is a powerful and easy-to-use tool for transforming XML documents.  For most transformations the simple point-and-click, drag-and-drop interface for mapping fields and inserting functions is intuitive and largely self-explanatory.  However, there are a few features that are not well known and are sometimes forgotten because they can only be reached through right-click menus.  This article covers one such feature: Filter Recurring Nodes.

What is Filter Recurring Nodes?

The Filter Recurring Nodes option, which is accessible by right-clicking a recurring node on the target side of a Map, allows you to apply an XPath predicate to the source document as part of the map.  XPath is a language for navigating XML hierarchies; it has a simple syntax and includes various functions for transforming data.  An XPath predicate is a condition applied within an XPath expression to select only the nodes that meet the stated criteria.  Predicates can use sub-path expressions, functions, and comparison operators to identify the nodes that should pass the filter.

How do I use Filter Recurring Nodes?

The best way to understand how to use Filter Recurring Nodes is with an example.  For this example we will use a common design pattern when working with the Salesforce.com Connector.  When working with the Salesforce.com Connector you may have noticed that in order to determine whether or not your operation completed successfully you must check the results output parameter of the Salesforce.com Activity.  The results output parameter is an XML document which contains a result element for each sObject passed to the input parameter of the connector.  Each result contains a boolean flag indicating whether or not the operation on the associated record was successful.  (Results are returned in the same order as the input data.)  Assuming you want to report on all the records that failed, you will need to collect those records and write the error messages to a database or send them in an email.
Without Filter Recurring Nodes you might use a For Each loop on the results object with an If..Then and an Expand Occurrences map.  This is a very inefficient way to collect these records (Expand Occurrences maps in particular are very inefficient), and your orchestration would be cluttered with unnecessary activities, which increases the potential for mistakes and hinders readability.  Fortunately, Filter Recurring Nodes provides an efficient and compact way to accomplish the same goal.
First, we will create a Salesforce.com Upsert Activity and go to the Map Outputs pane.  Create two variables based on the results output parameter of the activity and name them successes and failures.  Add the newly created variables to the map.  Now you can drag the recurring result node of the output parameter to the recurring result node of both the successes and failures variables on the target side of the map.  Next, right-click the result recurring node of the successes variable and choose Filter Recurring Nodes.  This opens the Filter Recurring Nodes dialog, where you will see an XPath expression with an empty predicate (/results/result[]).  Now you can fill in the predicate to complete the expression and filter for only the nodes where success equals true.  To do this enter the following text into the box:
*:success = true()  
Repeat this filter recurring nodes step for the failures variable and use the XPath predicate:
*:success = false()
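Putting each predicate together with the expression shown in the dialog, the effective filter expressions are:

/results/result[*:success = true()]
/results/result[*:success = false()]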

That's it: you now have a variable called successes that contains all the successful records and a variable called failures that contains all the errors.  The failures variable can now be transformed into an email message, logged to a database, etc.

How do I know if Filter Recurring Nodes has been Applied?

When reviewing an orchestration, especially one created by another developer, it is important to understand whether or not this feature is being used and which XPath predicate has been applied.  When a map contains a Filter Recurring Nodes condition, an icon is displayed next to the recurring node where it was applied.  You may also hover over the node to see the XPath predicate that was used.  See the screenshot below, where the Filter Recurring Nodes icon and the predicate are highlighted in green.


Notes on XPath predicates

You may have noticed a couple of things about the predicates above.  First, the *: before the success field name.  This is a namespace wildcard and is a common idiom in XPath expressions such as this, because it is often difficult to know which namespace prefixes have been declared.  When using this wildcard, be careful that you do not happen to have two fields in your document with the same name and different prefixes.  Second, the use of the true() function instead of a literal true.  XPath does not reserve a true keyword because XML does not reserve true as a keyword; the bare word true would therefore be interpreted as a reference to a node named true rather than the boolean value.  To get around this limitation XPath provides the true() and false() functions, which return their respective boolean values.

Further Reading

In order to use this feature effectively you will want to have a good background in XPath predicates.  The following resources may be helpful in understanding XPath:

Friday, August 2, 2013

QT014: A Quick Note on Upgrading to 6.3.0.0 When Using HTTP Receive

Quick Tip #014:  A Quick Note on Upgrading to 6.3.0.x When Using HTTP Receive

Several of our customers have experienced issues when migrating existing orchestrations that begin with an HTTP Receive Activity to version 6.3.0.x.  (We've seen this issue in both 6.3.0.0 and 6.3.0.1; we have not tested 6.4.0.0 yet.)  The issues mostly come up when you are manually parsing the URI; simple use cases that don't do anything with the URI seem to work just fine.  Our recommendation is that if you are using the HTTP Receive Activity as a starter for any of your orchestrations, you thoroughly regression test those orchestrations.  Additionally, if you are manually parsing the URI we recommend rebuilding the Activity to use the built-in parsing functionality.  See this post for more details on the new features of the HTTP Receive Activity.  There are two main problems that we have encountered so far:

  • The URI string is now passed to the orchestration URL-encoded.  If you were previously parsing this value with JavaScript or a flat file definition, consider using the new built-in functionality or the new HTTP Header functions available in the functions tab to parse out the path or extract a query parameter (a JavaScript sketch appears at the end of this post).
  • Certain URI strings cause the Activity to throw an exception and prevent the job from starting.  This is a bug that we discovered at a client today: if you have a query string in your URL with a parameter but no value, the HTTP Receive Activity will throw an exception.  If you encounter this issue, the workaround is to make sure that you pass a value to all your query parameters.
    • this throws an exception: http://www.example.com/MyTestOrchestration?value
    • this does not throw an exception: http://www.example.com/MyTestOrchestration?parameter=value
Both of these problems have easy workarounds, and the new functionality for parsing path and query parameters is certainly a welcome enhancement.  However, as always when upgrading to a new version, be sure to thoroughly regression test your orchestrations.
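If you do decide to keep a JavaScript-based approach for now, a custom function along these lines can undo the URL encoding and pull out a single query parameter.  This is only a sketch: the function name is illustrative, and it assumes the Studio JavaScript engine exposes the standard decodeURIComponent function.

function getQueryParam(uri, name) {
 // Split off the query string, if there is one
 var parts = uri.split('?');
 if (parts.length < 2) {
  return '';
 }
 var pairs = parts[1].split('&');
 for (var i = 0; i < pairs.length; i++) {
  var pair = pairs[i].split('=');
  if (decodeURIComponent(pair[0]) == name) {
   // Parameters with no value (the buggy case above) come back as ''
   return pair.length > 1 ? decodeURIComponent(pair[1]) : '';
  }
 }
 return '';
}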

Monday, July 1, 2013

QT013: Resolving DNS Resolution Issues in CIOS

Quick Tip #013: Resolving DNS Resolution Issues in CIOS

The Domain Name System (DNS) is a global distributed network of servers that is used to resolve hostnames such as blog.conexus-inc.com to an IP address which can be used by the network layer to connect to a remote host.

Common DNS Issues

DNS issues usually manifest themselves as UnknownHostExceptions in the system log.  An UnknownHostException is generated when an activity tries to resolve the hostname of a remote host and the DNS server either does not respond or does not have an entry for that host.  Another common issue occurs when your Cast Iron appliance resides behind the same firewall as the remote host and the DNS server returns the external address rather than the internal address.

Ensure DNS Servers are Properly Configured

You may specify multiple DNS servers in the networking configuration of CIOS.  This can be done via the Command Line Interface (CLI) using the net set nameserver command or under the networking panel in the Web Management Console  (WMC).  The first step in troubleshooting DNS is making sure that these settings are correct.  If you are able to resolve other hostnames and have isolated the issue to a specific remote host, there are a few other options to consider.

Use the IP Address Instead

Often you can bypass the DNS resolution process by replacing the hostname of a remote server with its IP address.  This is the simplest solution; however, there are some drawbacks to it under certain conditions.  By bypassing the DNS system you are taking responsibility for ensuring that if the IP address of the remote host changes, the change is made in your configuration properties.  Some endpoints such as Domino connect to a gateway which redirects the connection to another hostname.  In these cases, specifying the gateway's IP address will not resolve the problem unless the gateway is configured to redirect to an IP address rather than a hostname; otherwise you will still need to make sure that Cast Iron can resolve the hostname.  Also, there are SSL implications to using the IP address instead of the DNS entry.  SSL typically checks that the Common Name of a certificate presented by a remote server matches the hostname used to reach it.  If you use the IP address this check will fail.  You could disable hostname verification, but there is a better way . . .

Add an etc/hosts Entry

CIOS is, underneath the covers, a Linux server, and Linux servers have their own internal name resolution process that happens before reaching out to the DNS server.  If your DNS servers do not properly resolve a given hostname you may statically add it to the etc/hosts file.  When resolving a hostname, CIOS will first check for an etc/hosts entry and only contact the DNS server if the address is not resolved there.  Again, by bypassing DNS you are taking responsibility for maintaining the hostname-to-IP-address mapping.  However, this method has the benefit of allowing your orchestrations to connect using the actual hostname, which means that you can maintain the hostname-to-IP mapping in one place and SSL can still perform hostname verification.  etc/hosts entries can be added via the CLI using the net add etchost address <ip> hostname <name> command, where <ip> is the remote host's IP address and <name> is the FQDN.
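For example, assuming an internal host crm.internal.example.com that should resolve to 10.0.1.25 (both values here are placeholders), the command would be:

net add etchost address 10.0.1.25 hostname crm.internal.example.com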

Monday, June 3, 2013

QT012: Calling Apex Web Services from IBM Cast Iron

Quick Tip #012: Calling Apex Web Services from IBM Cast Iron

The standard interface between Cast Iron and Salesforce.com is the Salesforce SOAP API.  The SOAP API provides standard CRUD operations that work with all standard and custom objects in salesforce.com.  This is a powerful and robust interface; however, you may want to encapsulate complex business logic within Salesforce, much as you might in a database stored procedure.  This is possible to accomplish using Apex Web Services.

Exposing Apex Web Services with Salesforce

Salesforce.com allows you to expose your own custom web services using the Apex programming language.  Method declarations in Apex prefixed with the webservice keyword will be exposed as operations in an Apex web service.  Webservice methods must be declared as static methods within a global class.

global class MyWebService{
 ...
 webservice static String getName(Id id){
  Account a = [select Name from Account WHERE id=:id];
  return a.Name;
 }
 ...
}

Any static method declared within a global class with the webservice keyword will be exposed as a web service operation.  To generate a WSDL for the class, navigate to the class page under the Setup screen in SFDC and click the Generate WSDL button.  The generated WSDL will contain the definition for all web service methods in your class.


Can I Use Complex Types in an Apex Web Service?

You can define complex data structures to pass to and return from your web service operations by defining a global class to contain the data and declaring the fields of your class with the webservice keyword.
global class MyReturnParams{
 webservice Id id;
 webservice String message;
}

Calling Apex Web Services from Cast Iron

Once you have created your Apex Web Service and downloaded the WSDL, you can import the WSDL into Cast Iron Studio and create a Web Service Endpoint.

You are now ready to add your invoke Webservice Activity to your orchestration.

What About Authentication?

Salesforce.com Web Services uses a proprietary authentication handshake to get a token which must be passed to your Apex Web Service.  These tokens can get stale and expire, so to avoid building logic to deal with this, we will let Cast Iron do it for us.  CIOS has built-in logic to maintain a pool of sessions with SFDC, so we will leverage the SFDC Connector to get a token that we can pass to our web service.  To do this, we will need a Salesforce.com endpoint and an Activity that interacts with SFDC and returns a SessionHeader.  Any activity will do, so we will use a simple Get Server Timestamp call to Salesforce.  If you have used the SFDC connector before you may have noticed that the connector returns the session header that it used to make the call to SFDC.  You will need to copy this header and pass it in to your Apex web service call.  You will also need to capture the server URL that is returned and use it to construct the appropriate URL for your WSDL.  Because server URLs can change between environments and may change over time, it is a good idea not to hard code the URL of your instance and instead use the one that SFDC returns.  To do this you will need to write a simple JavaScript function to parse out the instance of your organization and use that to build the URL to pass to the location optional parameter of your Invoke Web Service activity.
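As a rough sketch of that JavaScript (the function name is illustrative, and it assumes the serverUrl returned by the connector has the usual https://<instance>.salesforce.com/services/... shape), you can keep the protocol and host from serverUrl and append the path to your Apex class:

function buildApexUrl(serverUrl, classPath) {
 // Keep 'https:', '' and the host portion of the returned serverUrl
 var host = serverUrl.split('/').slice(0, 3).join('/');
 // classPath would be something like '/services/Soap/class/MyWebService'
 return host + classPath;
}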

There are a few other required parameters that you will need to pass to various headers in the web service call.  To see the headers, right click the target side of your Map Inputs step on the invoke WebService Activity and choose show optional parameters.  Here is a list of the required parameters and what they do:
  • headers/SessionHeader/SessionId: This is where you will map the session id from the get Server Timestamp activity.
  • headers/DebuggingHeader/DebugLevel:  This field determines how much debugging info SFDC will return to your call.  For production this value should be None, as there is a governor limit for returning debugging details.  You want to avoid exceeding the limit on calls that do not need to be debugged so you can actually get debugging info when necessary.
  • headers/AllowFieldTruncationHeader/allowFieldTruncation:  This field defines the default behavior for the Database.DMLOptions allowFieldTruncation property.  Prior to version 15.0 SFDC would truncate strings when their value exceeded the size of the field.  This behavior was changed in version 15.0 to default to throwing an error when a string exceeds the field length.  This parameter allows you to specify the previous behavior as the default.  I believe that this parameter will not override the property if it is specified in your apex code.
Note: It appears that recent versions of SFDC now have a security header which allows you to specify a username and password for authentication.  This option could be used instead of the traditional method of obtaining a session header.  I have never used this method, but it seems that there are some trade-offs to consider here.  If you use the above method, you actually make two calls to SFDC: one to get a sessionId and another to invoke the web service.  (However, any call to SFDC will return the header, and I often find that I can get the session header from another call within an orchestration, such as a query that must be done before the Invoke Web Service.)  Specifying a username and password would allow you to do everything in one step; however, you would be authenticating on each call to SFDC, which can be an expensive operation in itself.  Also, the above method uses SFDC to retrieve the appropriate server URL from the login server.  Without that step you would have to hard code the server, which could change if SFDC moves your instance.  Salesforce.com provides plenty of warning when they do make changes, so that may not be a major concern.  More investigation is needed to determine whether the security header provides a cleaner solution to this problem.

Thursday, May 23, 2013

AP001: High Availability for IBM Cast Iron

Architectural Patterns #001: High Availability for IBM Cast Iron

Some folks out there have been asking about High Availability (HA).  With CIOS there are two general-purpose High Availability options: for physical appliances you can use an HA Pair setup to provide high availability, while in a Hypervisor environment you have several levels of HA built into VMWare.  Every situation is different, but we typically recommend VMWare as an HA mechanism because it offers more flexibility and many of our customers already have VMWare infrastructure and expertise.

First a bit of Background on High Availability and Fault Tolerance 

When designing a system you inevitably spend a lot of time thinking about what happens when something goes wrong.  Error handling logic is often the most time consuming part of system design, and it inevitably extends beyond your orchestrations to the platform itself.  System availability, measured as percentage uptime, is a common metric used when defining a Service Level Agreement (SLA).  For example, a system with 99% uptime can be down for roughly 1.7 hours a week, while a system with 99.999% uptime (five-nines is a common idiom when it comes to availability) can be down for only about 5 minutes every year.  Typically, when we talk about system availability, we are concerned with maximizing the amount of time that the system is available to process transactions.  There are two main reasons a system can be unavailable: maintenance or a system failure.  Having no maintenance windows at all, in what is typically referred to as a "Zero Downtime" architecture, is extremely difficult.  We've never attempted this type of scenario with Cast Iron because, when it comes down to it, not many users can justify the expense of that kind of SLA; they simply schedule downtime when the system is not heavily used, and fallbacks such as allowing transactions to queue can be used to allow system maintenance to occur.
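To see where those numbers come from: a week has 168 hours, so 1% downtime is 168 × 0.01 ≈ 1.7 hours per week, and a year has about 525,600 minutes, so 0.001% downtime is 525,600 × 0.00001 ≈ 5.3 minutes per year.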

Avoiding downtime due to system failures is referred to as "High Availability" or "Fault Tolerance."  How a system failure will affect system availability depends on how long it takes to recover from an outage.  Like anything, there is a spectrum of system availability options.  A typical set of options is as follows:
Zero Redundancy: You have no backup unit.  You need to call and order a replacement part and wait for it to be delivered and installed before you can power your unit back up.
Cold Spare: You have a backup unit, but it's sitting in a box in your data center.  You need to unrack the old unit, rack the new unit, plug in the network and power cables, boot the unit, patch it, load your projects, configure them, and finally start your orchestrations.
Hot Spare: Your spare unit is already racked and patched, with orchestrations loaded; you just need to switch the IP addresses and start your orchestrations.
High Availability: With a high availability solution, the process is fully automated.  You have reserved capacity to accommodate failover and an automated process to recover in under 10 minutes.
Fault Tolerant: With a Fault Tolerant system, the process is fully automated and resources are not only reserved but already allocated.  Failover is seamless to external systems; recovery time is under 10 seconds, ideally instantaneous.

What is a Physical HA Pair and How does it work?

With a physical HA Pair you actually have two physical appliances that are tied together in a Master / Slave setup.  The appliances have special hardware and dedicated network connections between them so they can replicate data and detect failure scenarios.  One of the appliances is designated the "Active" appliance and the other runs in "Passive" mode.  The Active appliance carries out all the work that a standalone appliance would do; however, it commits all changes to the Work In Progress (WIP) memory area to the Passive appliance.  The WIP is the persistent store that the appliance uses to record the state of all of your variables before any connector activity.  With the WIP replicated to the Passive appliance, should anything happen to the Active appliance, the Passive appliance is ready to take over as soon as it detects a failure.  When the Passive appliance takes over, it assumes the MAC addresses of the former Active appliance, so to external systems there is no change.  On the system availability spectrum this solution is somewhere between HA and FT: recovery is automatic and close to instantaneous; however, because the system recovers from the last state of the WIP, network connections to endpoints need to be reestablished and you need to understand the nature of the interactions with your endpoints.  Some endpoints support Exactly Once semantics, where the connector will guarantee that an operation is only performed once.  For example, the database connector does this by using control tables to synchronize a key between the appliance and the database: it inserts a key into the control table, and the presence of that key is checked before repeating an operation; if the key is present, the operation has already been completed.  We generally recommend that you design all processes to be idempotent, so it won't matter if a single interaction with an endpoint is repeated.  This is the easiest way to recover from errors, but it often requires careful design to achieve.

What Options Do I have with VMWare?

VMWare actually gives you several levels of High Availability to choose from, depending on the resources that you want to allocate to HA.  The simplest option within VMWare is VMWare High Availability, which requires a VMWare cluster with VMotion and VMWare HA configured.  In this mode VMWare will detect a server or appliance fault and automatically bring up the VM on a new server in the cluster.  There is a potential for some downtime while the appliance is started on the new server; however, the appliance will recover where it left off using the last state of the WIP before the crash.  The advantage of this setup is that resources do not have to be allocated to a redundant appliance until a failure occurs.  Essentially, the resources required to recover from a failure are reserved rather than allocated and therefore can be pooled.  VMWare also offers a higher level of high availability called VMWare Fault Tolerance.  With VMWare Fault Tolerance, failover resources are preallocated and VMWare actively replicates the state of your virtual machine to another server.  This method provides near-instantaneous recovery in the event of a failure, and unlike a physical appliance pair the replication goes beyond the WIP; therefore, in Fault Tolerance mode the failover can occur transparently even in the middle of an interaction with an external resource.  The disadvantage of this approach is that you need additional dedicated network resources for FT and you need to preallocate the memory and CPU resources for FT.  Therefore, FT effectively requires more than double the resources of HA due to the extra network requirements and the load required to replicate the state.  See this post for more details on setting up CIOS Hypervisor Edition.

Active/Active and Load Balancing Scenarios

The active/passive model works well when you want High Availability and your load does not exceed the capacity of a single appliance.  It is a simple but elegant design that provides transparent recovery in the event of failure.  This ease of use is perfectly aligned with what a typical customer expects from Cast Iron.  Further, in our experience a single appliance, when orchestrations are designed properly, is more than adequate for most Cast Iron users.

That being said, there are other options out there for load balancing and high availability using multiple appliances; however, most are dependent on the use case and the endpoints involved.  If you are using CIOS to host web services over HTTP you can use an HTTP load balancer to distribute load across multiple appliances; most HTTP load balancers have some means of detecting failed nodes and redirecting traffic.  For database sources you can use multiple buffer tables and write triggering logic to balance the load.  Other source systems such as SAP and JMS are also easily set up for load balancing across multiple appliances.

In the past we have also used a dispatcher model to distribute load; this is particularly effective when the load is generated by use cases with complex logic that leads to longer-running jobs.  With a dispatcher model, however, eliminating the dispatcher as a single point of failure can prove to be difficult and is use case dependent.

What About Disaster Recovery?

Disaster Recovery (DR) is a question of how to deal with catastrophic failure, such as when a hurricane destroys your datacenter.  Again, how quickly you can recover and what level of service you can provide in such an event will depend on your architecture and your budget.  The lowest cost DR solutions are usually manual workarounds to allow business to continue when a catastrophic failure occurs.  True seamless DR requires a remote datacenter with hardware replicating the main data center and automated recovery.  In most DR plans the recovery requires some manual processes, and in most there is ongoing maintenance that needs to occur to keep project versions and patch levels in sync.  Most DR plans call for DR appliances to be racked, mounted, and powered up at all times, but that too is a consideration and a cost.  Most customers who opt for a hardware solution will have an HA pair for their main appliance and a single node in a remote data center for DR.  Typically, the DR node is racked, mounted, and powered on with all the orchestrations loaded and configured but undeployed.  When it comes time to activate the DR appliance, it is theoretically just a matter of starting up the projects on that appliance.  Virtual appliance users typically have a DR plan for their virtual infrastructure, and Cast Iron falls in line with that plan.  However, planning for DR is typically application specific and requires thinking about the problem from end to end.  You need to understand the DR plan for any endpoints that you are integrating with and also understand where the state of your integrations is stored.  In the end there is a serious cost-benefit analysis that must be considered when planning for HA / FT and DR.  The business must decide where the proper balance is between SLA and budget.

Monday, May 20, 2013

QT011: Copy And Paste Between Projects in Cast Iron Studio


Quick Tip #011: Copy And Paste Between Projects in Cast Iron Studio

Most Cast Iron Studio users already know that you can copy and paste activities.  Cast Iron has good support for this feature and will identify when it needs to create new variables etc.  What a lot of users don't know is that you can actually copy and paste activities between projects.

How Do I Open two Projects at Once?

The key obstacle to cutting and pasting between projects is the fact that Cast Iron Studio only allows you to open one project at a time.  The answer to the question of how to have two projects open at once is actually quite simple: install a second copy of studio.  All you need to do is run the installer a second time and tell it you want to install studio in a new location.  Make sure that you also create a new start menu folder for your second copy of studio, and that is all there is to it.  You can now have two projects open at once.

Once you have the source and target projects open, you can copy activities from the source project and paste them into the target project in your second copy of Studio just as you would within a single project.  Whether you right-click and choose copy/paste or use the CTRL-C / CTRL-V shortcuts, it's just that simple.

What Can I Copy?

There is a catch to using this undocumented feature: certain things cannot be copied and may become corrupted when pasted.  Only Activities and their associated variables can be copied and pasted between projects.  Flat File Definitions, XML Schemas, WSDLs, and Stylesheets will have to be imported from the source to the target through the Project tab's Add Document dialog (remember that for flat file schemas you will have to choose All Files to find them in the source project).  Endpoints will need to be recreated, and you will likely need to repair some linkages by going to the Pick Endpoint step of your activities.  It's up to you to decide whether or not the extra work of relinking endpoints and other repair steps is worth it, but if you have a complex map that you do not want to rebuild in a new project it may be worth using this simple trick.

Tuesday, April 30, 2013

Cast Iron Version 6.1.0.15


Cast Iron Version 6.1.0.15

IBM has released a new version of CIOS: version 6.1.0.15 was released on 04/30/2013.  See the Release Notes for more information.

What do you need to know about this Release?

This is a FixPack release; it rolls up the prior iFix releases, including several fixes for security vulnerabilities as well as numerous runtime and Studio bug fixes.

You can get the new release here.  For more information on minor version upgrades see this post.

Monday, April 29, 2013

QT010: Using the IfThen Activity in Cast Iron Studio

Quick Tip #010: Using the If..Then Activity in Cast Iron Studio

Most Cast Iron developers are probably already familiar with the If..Then Activity; however, there may be some things that you didn't know about it.  The If..Then activity allows you to incorporate conditional logic in your orchestration and only execute certain activities under certain conditions.  When you drag an If..Then activity into your orchestration, you can place activities that should only be executed under certain conditions inside the If branch of the If..Then activity.  You may also place activities that should only be executed if the condition is false into the Else branch.

Manipulating the Branches

An If..Then activity can have multiple branches, each with its own logical test for execution.  You may add a branch to an If..Then activity by right-clicking in the outer gray box and choosing Add "If".  You can delete branches by right-clicking the If (or Else) icon and choosing Delete.  (Note: if you delete the Else branch you may add it back by right-clicking inside the gray box and choosing Add "Else".)  The logic of each branch behaves like an if(){} else if(){} statement in C or Java, not like a switch or case statement.  That is to say, each condition is evaluated from the top down, the branch with the first condition to evaluate to true is executed, and processing continues after the If..Then activity.  If none of the conditional expressions are true, then the Else branch is executed (if one exists).  You may reorder branches by right-clicking the If icon for a branch and choosing Go Up or Go Down to move the chosen branch up or down in the order of precedence.

Add a branch to an If..Then by right clicking inside the outer gray box and selecting Add "If"

Defining the Logic for If..Then Branches

The logic for an If branch is essentially defined using XPath syntax.  You may use the builder to define your logic or you may click the Advanced button and define the logic as a free form string.  Note that switching back and forth may cause issues because you can create expressions in advanced mode that cannot be expressed with the builder.  Note: if you need to define the order of operations you can use advanced mode and use parens around your expressions to define how the expression should be evaluated.

An example of using advanced mode and parens to change the order of operations.
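In text form, such an expression might look like the following (the variable and field names are purely illustrative); the parentheses force the OR to be evaluated before the AND:

(bpws:getVariableData('order')/*:type = 'NEW' or bpws:getVariableData('order')/*:type = 'UPDATE') and bpws:getVariableData('order')/*:total > 0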

Using the Builder

The builder allows you to create a set of boolean expressions tied together with the AND and OR logical operators.  Each expression contains a left-hand side, an equality operator, and a right-hand side.  Both the left-hand side and the right-hand side let you click an ellipsis button to choose a field from a variable defined in your orchestration, or you may set a static expression.  If you set a static expression and the expression is a string type, be sure to encapsulate it in single quotes.  If it is a boolean, use the true() or false() XPath functions to ensure that it is properly evaluated.  You may also use other XPath functions and constructs in your expressions.  For example, a common pattern to determine whether or not an element is null is to ensure that the tag is present and that the value is not the empty string.  To do this, use a count function to count the number of instances where the value is not equal to the empty string, such as:
count(bpws:getVariableData('var')/*:body/*:record[*:element!=''])>0 
The above example uses both an XPath function (count) and an XPath predicate to determine whether or not an element is null.  You can use other functions such as starts-with, substring, etc to build even more complex expressions.

There are several equality operators available via the drop down.
You can add multiple conditions tied together with AND/OR logical operators.


Monday, April 15, 2013

QT009: Using Shared Variables in CIOS


Quick Tip #009: Using Shared Variables in CIOS

What is a Shared Variable?

In CIOS a shared variable is a variable that is "shared" across all jobs for a given orchestration.  By assigning a value to a shared variable on the target side of a map, you make that value accessible to all other jobs of that orchestration.  Like all variables in CIOS, shared variables are persisted in the event of power loss.  Shared variables are only cleared when an orchestration is stopped via the WMC.

Run Orchestration Jobs Sequentially

When you use a shared variable in an orchestration, the option to run all jobs in a single instance is automatically checked.  (In prior versions this option was called Run Orchestration Jobs Sequentially.)  It is not possible to read and write a variable simultaneously, so running jobs in a single instance ensures the consistency of your shared variables because there is no simultaneous access.

Why Use Shared Variables?

Shared variables are a powerful concept and enable numerous use cases; we'll highlight a few of the most common:

Sequence Numbers

You can use shared variables to generate sequence numbers for files, database records, etc.  By simply creating a shared variable and incrementing it each time you need a new value you can easily generate sequence numbers.
Tip: It is important to make sure that you have a way to initialize your sequence number, whether by reading the highest existing value, using a configuration property, or by some other means, because you may need to reset or advance the sequence.

Last Updated Date

If you are polling for changes in a resource and for any reason are not using one of the polling connectors, you may need to keep track of the last record that was successfully processed.  A shared variable is a good way to keep track of such values.  Again, you will want to make sure that you have a way to manually set this date on startup or change the date in order to reprocess records.

Caching

Shared variables can also be used to cache values such as a session key for a web API.  You can cache any XML document, so it is possible to cache complex structures such as a lookup table.  Note that because jobs will run sequentially, for performance reasons you should be careful about when to use this option versus caching values in a database or external system.

Monday, April 1, 2013

QT008: Quick References for Mapping with JavaScript

Quick Tip #008: Quick References for Mapping with JavaScript

JavaScript is a simple scripting language with a syntax derived from the Java programming language.  Because of its simplicity and pervasiveness, JavaScript is commonly used for mapping data in middleware platforms such as IBM Cast Iron and Boomi.  In this post we will run through a quick example in CIOS and provide a couple of resources from around the web for more in-depth study of JavaScript.

A Quick Example: Parsing Email Addresses

In this example, we will parse the first email address out of a given string in Cast Iron.  Cast Iron provides a number of built-in functions for performing common tasks.  These functions are available in the Functions tab and can be dragged into any map to translate inputs from the source side to outputs on the target side.  If there is no suitable built-in function, you may write a custom function using JavaScript.  To do so, click the Add New Custom Function . . . link at the bottom of the Functions tab.


There are two tabs: the first allows you to configure the name, the return type, and the input parameters for your custom JavaScript function.  For this example the return type will be a string and we only require one string input, which we will call input.  Click Next to define the body of your custom function.


In Cast Iron, you define only the body of a custom JavaScript function.  You can pass an arbitrary number of string, boolean, or number parameters to your function, but you may only return a single value.  Because of the nature of JavaScript functions, it is actually possible to nest functions within your function should you need to get creative.  See some of the tutorials discussed below for other advanced features of the JavaScript language.  In this example we will use the string method match() to apply a regular expression to the input string.  The match function applies a regular expression to a string and returns an array of all matches.  We create a variable to store this array and test whether it contains at least one value.  If it does, we return the first match; otherwise we return null.  It is as simple as that.  You may use your custom function in any map as you would use any of the built-in functions.
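A minimal sketch of the logic is shown below.  It is written as a complete function for readability (in Studio you supply the equivalent body), and the regular expression shown is just one reasonable email pattern; adjust it to your own requirements.

function parseFirstEmail(input) {
 // Find every substring that looks like an email address
 var matches = input.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g);
 // Return the first match if there is one, otherwise return null
 if (matches != null && matches.length > 0) {
  return matches[0];
 }
 return null;
}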

References

Mozilla - The source for JavaScript.  JavaScript was originally included in the Netscape browser, which became the Mozilla Project.  Mozilla is the authoritative source for JavaScript and hosts numerous resources including documentation and tutorials.
W3 Schools - W3 Schools provides tools and documentation for web technologies such as CSS, HTML, and of course JavaScript.
regular-expressions.info - One of the best guides on the web for working with and learning about regular expressions.

Monday, March 25, 2013

QT007: FTP Polling Considerations

Quick Tip #007: FTP Polling Considerations

FTP has been around since 1980 and still remains a popular protocol for integration.  The protocol is simple and widely adopted.  There are free open source implementations of the client and server for many platforms and the protocol is supported by major integrations platforms such as IBM Cast Iron, Dell Boomi, Informatica, and many others.

Active vs Passive FTP

There are two modes in which connections are established in FTP: Active and Passive.  In the original protocol, now called Active mode, the client establishes a control connection to the server and uses the PORT command to tell the server which port to use when establishing a data connection to transfer files.  Such a protocol requires the client to be directly addressable by the server and therefore causes problems if the client is behind a firewall.  There are ways to use Active FTP from behind a firewall; however, there are certain considerations.  If the client has a public IP address, or one the server can otherwise reach, then you simply need to open a port to the client for the data connection and tell the client to pass that port when issuing the PORT command.  If the client has a private IP and your firewall uses Network Address Translation (NAT), the firewall may have a feature to support Active FTP by proxying the PORT command and the data connection.  If it does not, then the client and server need to support Passive mode.  In Passive FTP, instead of issuing a PORT command, the client issues a PASV command and initiates the data connection from the client side rather than the server side.

Security Concerns

Basic FTP does not use any form of encryption, even for FTP passwords, and therefore is not suitable when sensitive information is being transferred over public networks.  There are a couple of protocols that deal with this problem, and both are in wide use today.

FTPS

FTPS is the secure implementation of the File Transfer Protocol: it is an implementation of the entire set of FTP commands over a secure connection.  Again there are two modes, Implicit and Explicit.  Implicit FTPS is now deprecated, but as the name implies all traffic is sent over a secure SSL/TLS connection; the server uses SSL/TLS to negotiate a secure connection with the client before any commands can be executed.  Explicit FTPS is the currently supported standard and allows one server to provide both FTP and FTPS.  By issuing the AUTH SSL or AUTH TLS command the client can request a secure connection, or it can skip the AUTH step and continue with an unencrypted session.  Obviously, users with access to sensitive data should be required to issue AUTH SSL or AUTH TLS and should be rejected if they do not.

sFTP

sFTP is the SSH File Transfer Protocol.  It is not strictly related to FTP, but it implements a very similar command set, and for most purposes it is very similar from the user's perspective.  This protocol uses the same security standards as SSH and is widely available on Unix platforms because most SSH servers implement it.

Avoiding Partial File Transfers

In FTP there is no standard way to lock a file to indicate that a transfer is in progress.  Many clients have unobtrusive ways of avoiding transferring a partial file.  IBM Cast Iron, for example, will check the file size before and after the transfer to see if it has changed.  If the file size changes during the transfer, the client knows the file was still being written while it was being downloaded and it may not have received the entire file, in which case it restarts the transfer and repeats the process until it receives the entire file without the file size changing.  This system works, but it is not foolproof; there really is no implicit way to know that the uploader is done with the file before it is downloaded.  There are, however, several easy ways to avoid this problem by having the uploader take specific action to indicate that the file is ready for download.  The first way is to rename the file after transfer.  If you are loading a file called my-file.csv, you can load the file as my-file.tmp and then rename it to my-file.csv once it has been loaded completely.  This ensures that the entire file is present before you try to download it.  Another solution is to use a control file: a separate file that is loaded to indicate to the client which files are ready to be downloaded, possibly including processing instructions such as what encoding was used for the file.  A third option is to use a checksum file; by loading a cryptographic checksum file along with the file to be transferred you can ensure not only that the file was transferred completely, but also that it has not been corrupted.
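On the uploader's side, the rename approach corresponds to standard FTP commands; a raw command sequence (file names taken from the example above) would look roughly like:

STOR my-file.tmp
RNFR my-file.tmp
RNTO my-file.csv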

Monday, March 18, 2013

QT006: Understanding Character Encodings in CIOS

Quick Tip #006: Understanding Character Encodings in CIOS

Background

A character encoding scheme is a means of digitally encoding characters for electronic interchange and storage.  A character encoding translates the semantic meaning of a character system into a digital format, which is independent of how the characters are displayed.  A font, on the other hand, translates characters to glyphs that can be rendered on screen or paper.  A number of standard encodings have been developed over the years to increase interoperability between systems; however, there is still no universally accepted character encoding scheme, and therefore tools like Cast Iron support multiple encodings and provide the ability to translate between them.  Cast Iron supports a number of modern standards for encoding as well as a few legacy encoding systems that are still in use occasionally.

ASCII

In the early days of computing, processors were designed to work with numeric data in 8-bit bytes.  A byte can encode 256 different values, and that was plenty to support commonly used US characters.  Therefore, one of the first standardized encoding schemes, the American Standard Code for Information Interchange (ASCII), was born to encode 128 different character values: 26 uppercase and 26 lowercase letters, 10 digits, 33 punctuation and symbol characters, and 33 control characters.

Other Single Byte Encodings

Although 26 lowercase and 26 uppercase letters are sufficient for US English, other languages use more and different characters.  There have been many attempts to address this with proprietary standards such as windows-1252 or IBM's EBCDIC.  There are also several encoding schemes from the International Organization for Standardization (ISO) that provide single-byte character encodings for various character sets.  ISO-8859 is an extension of ASCII that uses the unused bit in the ASCII scheme and replaces some of the control characters with printable characters.  ISO-8859 defines 16 different mappings that are useful for various languages; ISO-8859-1, for example, is a single-byte encoding for popular characters in Western European languages and is popular because it is backwards compatible with ASCII.

Multi Byte Encodings

Single-byte encodings are sufficient for languages with fewer than 256 common characters, but some languages have thousands of characters.  For those, limiting characters to a single byte is not sufficient and a multi-byte encoding system is essential.  In order to provide a broader standard for encoding characters, the Unicode standard was developed to encompass most of the known characters used in writing systems around the world.  Unicode defines over 1,000,000 code points to describe characters, which can be encoded in various Unicode transformation formats using up to 4 bytes.  There are two main standards in use today for Unicode characters, UTF-8 and UTF-16.  Both seek to reduce the overhead of using a 4-byte code for every character by encoding the most commonly used characters with one or two bytes and expanding to up to 4 bytes for other characters.  UTF-8 uses the same encoding as ASCII for the first 128 characters and adds additional bytes to represent characters outside that range.  UTF-16 uses two bytes by default to represent the most commonly used characters in modern languages, and is better suited for languages such as Chinese that would frequently be forced to use 3 bytes in the UTF-8 scheme due to the number of characters in common use.

Encodings in CIOS

Translating Encodings at the Endpoints

Because CIOS is a Java-based platform, the native encoding is UTF-16 and all operations are performed in this encoding.  It is therefore necessary to translate data to this encoding when CIOS loads it from an endpoint.  For most endpoints you do have the option of deferring this translation and loading the data in binary format, in which case it will be Base64 encoded and processed in the system as a Base64 encoded string.  Cast Iron supports translation to and from the following encodings: UTF-8, US-ASCII, SHIFT_JIS, EBCDIC-XML-US, ISO-8859-1, EUC-JP, and Cp1252.



You can even dynamically set the encoding in some endpoint activities.  This allows you to parameterize the input and output encodings by reading them from a flat file, database, or configuration property.



Translating the Encoding in Transformation Activities

Most of the Transformation Activities, such as Read/Write Flat File, Read/Write XML, and Read/Write JSON, allow you to specify the encoding in the Activity.  This functionality allows you to pass the Read activity a Base64 encoded binary message and specify the encoding in the Configure step, translating the encoding and transforming the data in a single step.  This can be helpful in cases where the encoding cannot be translated in the endpoint, such as data that is read from a BLOB in a database, or in cases where you need to support multiple encodings.



Again, the encoding can be set dynamically in the activity by showing the optional parameters and mapping an encoding parameter to the Encoding input.


MIME Messages

Initially, many Internet specifications required text to be encoded with ASCII characters; the Multipurpose Internet Mail Extensions (MIME) protocol was developed to allow other encodings and binary types to be sent over protocols designed with ASCII in mind.  The Read and Write MIME activities can be used in conjunction with the Email, HTTP, FTP, or really any other connector to properly format and parse multipart MIME messages.  The most common scenario for using multipart MIME messages is handling emails with attachments, and it is in these cases that the dynamic controls for encoding in the various other activities can be very useful.  There are two headers that are important to understanding the encoding parameters of MIME messages: the Content-Type header and the Content-Transfer-Encoding header.  The charset parameter in the Content-Type header tells you how text within each part of the message is encoded, while the Content-Transfer-Encoding tells you how the binary data is encoded.  In most scenarios, the Content-Transfer-Encoding will be 7bit for ASCII text and base64 otherwise; however, it is possible to have ASCII data that is sent with a base64 Content-Transfer-Encoding or, in rare circumstances, an 8bit or binary Content-Transfer-Encoding.  (Most Internet protocols are designed with 7-bit printable characters in mind and do not allow raw binary data to be transferred.)
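For example, the text part of an email with a PDF attachment might carry headers like these (values are illustrative):

Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

while the attachment part might look like:

Content-Type: application/pdf; name="report.pdf"
Content-Transfer-Encoding: base64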

Monday, March 11, 2013

QT005: Allocating More Memory for Cast Iron Studio

Quick Tip #005:  Allocating More Memory for Cast Iron Studio

By default, the maximum amount of memory available to Cast Iron Studio is 512MB.  Most of the time, that is more than adequate.  However, if you find yourself working with a large XML Schema in a map, testing a complicated XSLT, or doing other memory-intensive work, you may need additional memory.

Background

Cast Iron Studio is a Java application, and like all Java applications the amount of memory available to it is bounded by the Java Virtual Machine (JVM).  There are several JVM parameters related to memory that can be set at JVM startup time; typically the most important are those related to the JVM heap.  The heap is the long term / global memory used by the application, and there are two parameters that matter here: the minimum and maximum size of the heap.  The minimum or initial size is the memory that is allocated when the JVM starts, and the maximum size is the upper bound of the heap.  If an application tries to allocate more memory than the maximum heap size, a java.lang.OutOfMemoryError is thrown.  For our purposes the minimum heap size can be left alone; the heap will grow automatically until the maximum is reached, and the default value is typically fine for use with Studio.

How do I set the Maximum Heap Size for Cast Iron Studio?

First, locate the CastIronStudio.exe executable; it should be in the main folder where you installed Studio.  Right-click the executable and select Create Shortcut:



This will create a new file called Shortcut to CastIronStudio.exe.  You will now need to edit the shortcut and specify the JVM parameter to increase the heap size: -J-XmxSSSSm, where SSSS is the new heap size in megabytes, e.g. -J-Xmx1024m.  See the screenshot below:
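In text form, the Target field of the shortcut might end up looking like this (the install path shown is just a typical default; yours may differ):

"C:\Program Files\IBM\WebSphere Cast Iron Studio 6.3.0.1\CastIronStudio.exe" -J-Xmx1024m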



Next you will probably want to switch to the General tab and change the name to something more meaningful and indicative of the parameter that you set, such as "Cast Iron Studio 6.3.0.1 - 1024m".  That way you will be able to distinguish the modified version from the original and will know whether or not you are using the larger heap size.

That's it: just double-click the shortcut to run Studio with a larger heap size.  Note: we demonstrated this setting with the latest version, 6.3.0.1, on Windows XP; however, the process is the same for any install4j-based version of Studio.  Also, the process on other versions of Windows such as Windows Server or Windows 7 is almost identical (the only change on Windows 7 is the name Windows generates for the shortcut).

Monday, March 4, 2013

FR002: Job Keys

Feature Review #002: Job Keys

What are Job Keys?

Job keys are a useful utility feature in CIOS that allow you to tag values to each job that runs.  Tagging a job with a particular key allows you to search for that key on the WMC to find the job.  The primary key is also displayed in job list views in the WMC.

Using Job Keys

Creating and using Job Keys in CI Studio is a simple two step process:

  1. Managing Available Job Keys: Open your orchestration and click the green starter dot (see the screenshot below).  This will bring up the orchestration pane; in the first section you will see the list of job keys.  To add a job key click Add.  There is a checkbox to make a particular key the primary key; note that only one key can be primary, so you will have to uncheck it to select a different key.  Select a key and click the Remove button to remove it.
  2. Creating Job Keys: To create a job key simply use the Create Job Keys activity and map a value to the key that you want to create.  Note the name of the activity: Create Job Keys, not Set Job Keys, which indicates that each time you use this activity a new job key with that value is created.  If you call it twice for a single key you will see two values for that key in the WMC after the job runs.
Click the green starter dot to manage available job keys.

Use the Create Job Keys activity to set your job keys.

Design Patterns

Job keys allow your jobs to be searchable in the WMC.  Therefore, job keys are very useful when storing cross-reference information.  For example, if you are writing an orchestration to sync accounts between SAP and Salesforce, it may be useful to store an account id as a job key so you can quickly see when an account was last synced.  In the same example, it may also be useful to store the IDoc number in order to trace an IDoc through the system.

The primary job key is what will be displayed in the WMC in job list views.  Therefore, setting a meaningful primary key will help you to distinguish one job from the next.  It is also a good place to display status information about the jobs that have completed.  This is especially true for batch jobs, where it is very useful to use the primary key to indicate how many items within a batch were processed successfully, had warnings, or had errors.  To accomplish this, all you need to do is add a job key called status and make it primary.  Then calculate the number of successes, warnings, and errors and map them in the Create Job Keys activity using a concatenate function to form a string like: Batch Job Complete. success: 5, warn: 2, error: 1.
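If you would rather build that summary string in a single custom function than chain several concatenate functions, a sketch like this (the function name and parameters are illustrative) produces the same result:

function buildStatusKey(successCount, warnCount, errorCount) {
 // Produces e.g.: Batch Job Complete. success: 5, warn: 2, error: 1
 return 'Batch Job Complete. success: ' + successCount +
  ', warn: ' + warnCount + ', error: ' + errorCount;
}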

Avoid overuse of this feature; as noted above, once the Create Job Keys activity is called the job key is logged permanently.  If you are processing hundreds or thousands of items within a batch job, it might seem like a good idea to log a key for each item; however, this can create a dramatic drag on the performance of your orchestration which may not be immediately evident (because logging is asynchronous and indexing job keys in the logging system is an expensive operation).  Instead, use the status summary design pattern to track aggregate status, or use an external database for logging if you need that level of detail for batch processes.

Thursday, February 28, 2013

CIOS 6.3.0.1 Is Now Available


Cast Iron Version 6.3.0.1

IBM has released a new version of CIOS: version 6.3.0.1 was released on 02/28/2013.  See the Release Notes for more information.

What do you need to know about this Release?

This is a FixPack release; it rolls up the prior iFix releases, including a fix for a security vulnerability affecting customers using LDAP.  This release also contains some enhancements to connectors: if you are experiencing timeouts with the NetSuite connector there is a fix for that, and there is an enhancement for the Domino Connector to allow you to specify the database as a configuration parameter.  Finally, there is also an important fix for an issue that may cause the WMC to become inaccessible when the disk fills up.

If you are experiencing any of those issues with 6.3.0.0, or would like to try out the enhancements to the Domino connector, you can get the new release here.  For more information on minor version upgrades see this post.

Monday, February 25, 2013

QT004: Cast Iron Hypervisor Edition

Quick Tip 004: Installing IBM WebSphere Cast Iron Hypervisor Edition

In this post we will cover the steps necessary to configure VMWare for Cast Iron Hypervisor Edition.

Installing ESXi Server

In order to run Cast Iron Hypervisor Edition you will need a physical machine running ESX or ESXi version 4.0.0, Build 164009 or later.  The ESXi software runs on a variety of platforms and is installed as the base operating system for the server; there is no need to install Linux or Windows.  If you do not have an HA requirement, the ESXi Server is all you need to run Cast Iron Hypervisor Edition.  The installation process is extremely simple: download the version that corresponds to your hardware, insert your bootable media into the machine, and the installer will ask only a few simple questions before installing and rebooting the machine.  Once the software is installed, configure the network parameters and download vSphere Client, the tool you will use to deploy the CIOS Virtual Machine.  vSphere Client is available as a download once you have installed ESXi Server by navigating to the URL presented at the end of the setup process.

Installing vCenter Server

If you have a requirement for High Availability and want to create a cluster of ESXi nodes, you will need to install vCenter Server.  An ESXi Cluster is a set of ESXi servers that run in tandem and allow you to migrate VMs between nodes, manually or automatically, even while they are running.  vCenter Server is also very easy to install; for small deployments you can even run vCenter Server as a virtual machine deployed to your ESXi Cluster.  vCenter Server is available for installation on top of Windows Server or as a Linux based Virtual Appliance.

vMotion, vSphere HA, and Fault Tolerance

vMotion is the tool for migrating VMs from one node to another while they are running, with no perceivable interruption to other systems.  vMotion copies the memory of your system to another server while it is running, pauses the system long enough to capture a delta of the memory changes that happened during the copy, and then resumes the machine on the new server.  vMotion also takes care of the networking changes in a way that is transparent above layer 2, so even network connections are not interrupted.  vSphere HA is a technology that monitors servers, virtual machines, and even applications running on your VMs for failures, and responds to failures by restarting VMs or moving them to another node in your cluster.  Fault Tolerance provides another level of service that delivers zero downtime by continuously replicating the state of your VM to a slave instance on another server; if the master instance fails, the slave takes over with virtually no interruption.  The Fault Tolerance architecture most closely resembles the level of service that the Cast Iron hardware based High Availability Pair is designed to provide.

Server and Network Requirements for vMotion, vSphere HA, and Fault Tolerance

In addition to the networks required to support your Virtual Machines, you will need a separate physical network for vMotion and HA, and if you intend to use Fault Tolerance a separate set of NICs should be dedicated to it.  VMWare allows for redundant NICs, so if you want to use vMotion, vSphere HA, and Fault Tolerance you will want a minimum of 8 physical NICs to create 4 separate networks (VM Management, VMkernel for vMotion, Fault Tolerance, and finally the network for your virtual machines).

Installing Cast Iron

Cast Iron Hypervisor Edition is distributed for VMWare as an OVA file which can be loaded using vSphere Client.  Follow these instructions in the Cast Iron documentation on how to load the OVA file.  For information on the 6.3.0.0 release see this post.  You will most likely want to apply the latest patches; further information can be found here.  (Note: As of the time of this post, there appears to be a problem with the download for the latest iFix for CIOS 6.3.0.0; the Hypervisor Edition patch appears to actually be for a physical appliance, so you will likely have to contact IBM Support in order to obtain the latest patch.)

Monday, February 18, 2013

QT003: Migrating from a Pre-IBM Version of CIOS

Quick Tip 003: Migrating from a Pre-IBM Version of CIOS

In this post we will review some of the things that you need to know if you are moving from a Dell based Cast Iron solution to the latest IBM WebSphere Cast Iron.  We will cover some differences in terminology as well as the new choices available in the latest version of Cast Iron.  First, if you are still running a legacy version of CIOS and are looking to upgrade, there are a couple of new alternatives to the standard hardware based appliance:
Hypervisor Edition:  IBM Cast Iron offers a virtual appliance that is designed to run on VMWare or the open source Xen virtualization platform.  The Hypervisor Edition is a good solution for customers who are already investing in virtualization, and it provides a simple solution for high availability as well.
Cast Iron Live: IBM Cast Iron also offers a cloud based solution for customers who don't want or need an on premise solution.  Cast Iron Live does support a hybrid cloud solution via the Cast Iron Secure Connector, an agent that users can run within their private network to allow Cast Iron Live to interact with local databases, web services, CRMs, etc.
DataPower Appliance: The most direct upgrade path is the XH40 DataPower appliance, which is the new IBM hardware based solution.  It works much the way the Dell based product worked; it's a physical box that is for the most part a direct replacement for the Dell based solution.  HA Pair configurations continue to be supported by the DataPower appliances as well.

Upgrading Orchestrations

Cast Iron Studio automatically performs upgrades when you open a project from a prior version of CIOS.  Although it's not strictly required, we do recommend stepping through major version upgrades to ensure maximum compatibility.  For example, when going from, say, version 4.5 to version 6.1, you can open and save the project in the latest version 5 Studio before moving to version 6.1.
Some new features may not be available in upgraded projects until you refresh your connectors.  For example, the Salesforce connector was upgraded in version 6 to allow foreign key upserts.  In upgraded projects you will not see the drop down for external id fields until you refresh the connector schema in Studio, because Studio needs to connect to Salesforce to discover and leverage external id relationships.

Going Cloud and Using the Secure Connector

Cast Iron Live is the cloud offering from IBM Cast Iron.  For the most part, projects and orchestrations built for other versions of Cast Iron are compatible with the cloud version.  The main exception is the obvious fact that on premise systems cannot be directly accessed from the cloud.  In order to connect to on premise systems, you will want to install the Cast Iron Secure Connector.  (We will cover more about the Secure Connector, how to install it, and how it works in a future post.)  Connector operations work much the same way in a hybrid Cloud / On Premise solution; you simply need to tell the connector which Secure Connector to use when connecting to the resource.

Web Services in the Cloud

Cast Iron Live provides several new ways of exposing web services:
  • you can provide web services to your internal private network via the Secure Connector
  • you can also provide external web services authenticated using a query string in the URL.  (This will be covered in more detail in a later post on using Salesforce Outbound Messaging with Cast Iron Live.)
  • webapi.castiron.com is the latest tool from IBM Cast Iron for exposing web services to third parties.  (This will also be covered in more detail in a later post.)

Hypervisor Edition and High Availability

If you are migrating from a High Availability Pair you may want to consider the new virtual appliance, as it is fully compatible with VMWare vCenter High Availability.  With HA and vMotion it is possible to load balance and provide high availability in your virtualization cluster.  VMWare High Availability continuously monitors both host and guest operating systems in a virtualized environment, detects failures in real time, and moves virtual machines to a healthy node in your cluster.  CIOS also provides a persistence layer which allows it to seamlessly recover in the event of such a failure.  When VMWare moves the CI appliance to a new node in the cluster, CIOS will attempt to resume all jobs from their previous state.  Do note that the ability to recover in-flight transactions requires support from the endpoint and varies by connector.  It is therefore advisable to ensure that interactions with your endpoints are idempotent, because if a transaction is in process when a failover occurs it will likely be restarted.  (We will cover configuring VMWare for the CI Hypervisor Edition in a future post.)

Notes for Power Users

If you have ever used the system shell command on the Dell based appliance, you may be disappointed to learn that you are no longer allowed to access the Linux shell.  There are some new commands in the CLI to access the system logs; however, for security reasons, there is no longer access to the shell.  See the online help for the CLI for more details on the new CLI commands for debugging.
Also, somewhere along the way, CIOS changed to use the internal persistence layer as a paging store for large documents, so it is not advisable to turn persistence off for any orchestrations unless you really understand the consequences.  If you had been running with persistence off in a legacy version, you will want to turn it back on when upgrading to avoid potentially dramatic performance degradation.

Monday, February 11, 2013

FR001: Support for RESTful Web Services and JSON in CIOS 6.3.0.0

Feature Review #001: Support for RESTful Web Services and JSON in CIOS 6.3.0.0 

Starting with CIOS Version 6.3.0.0, IBM Cast Iron has expanded support for RESTful web services, including new activities for parsing the JSON message format.  Web APIs built around RESTful architectural patterns are becoming more and more prevalent in cloud based IT systems.

First a Little bit of Background on REST and JSON

REST

REpresentational State Transfer (REST) is a concept that was laid out by Roy Fielding, one of the authors of HTTP, in his Ph.D. dissertation.  It is the now ubiquitous architectural pattern that the modern Internet is built around.  Chapter 5 of Fielding's dissertation is the best resource for the clinical definition of REST.  However, REST is an architectural pattern, not a protocol, and it is not limited to the modern interpretation of a RESTful web service.

JSON

JavaScript Object Notation (JSON) is a message format derived from the JavaScript syntax for defining objects.  JSON is often used in Web APIs because it is a lightweight, human readable alternative to XML that can be consumed in JavaScript without a separate parser.  Because of those properties it is a very popular message format for Web APIs that are intended to be called from a web browser.
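For instance, a small JSON document is simply JavaScript object literal syntax (this snippet is purely illustrative):

  {
    "isbn": "9780132350884",
    "title": "Clean Code",
    "available": true
  }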

Modern RESTful Web APIs

Whereas SOAP based Web Services strive for interoperability through rigidity, RESTful Web APIs strive for interoperability through simplicity.  Rather than using a complex, well defined protocol like SOAP, RESTful Web APIs use simple message formats based on XML or JSON that are exchanged using standard HTTP methods.  This lightweight simplicity grew from necessity: modern Web APIs evolved as a means for building web applications in JavaScript, so they can easily be called from web browsers and mobile platforms.  Because RESTful APIs use simple HTTP requests and self describing data formats (typically XML or JSON), and because there is as yet no formal standard for documenting these APIs, working with RESTful APIs is often an autodidactic experience.  If you need to know what fields a call will return, you can make the call and see what comes out.


SOAP based Web Services are typically implemented on top of the HTTP protocol (there are some platforms that support SOAP over JMS or even SMTP, but by nature they are a level on top of the underlying transport).  RESTful Web APIs are implemented at the HTTP layer rather than on top of it.  They allow clients to interact with the server using the features of HTTP, including HTTP methods (GET, PUT, DELETE, POST, HEAD, OPTIONS, etc.), URIs, query strings, headers, etc.
  • URI: Web APIs generally use Uniform Resource Identifiers (URIs) to identify resources that a client can interact with.  Addresses are typically hierarchical and identify a class of resources by name and a single entity underneath it by its unique identifier.  For example you may have something like /books/{isbn}, or even a deeper hierarchy such as /library/branches/{branch-id}/books/{isbn}, where branch-id uniquely identifies a branch within a library system and isbn uniquely identifies a book.
  • Methods: Generally, the actions that can be performed on a resource are mapped to HTTP methods.  GET to retrieve a resource, PUT or POST to create or change a resource, DELETE to delete it, etc.  See RFC2616 Section 9 for more details on HTTP/1.1 methods. 
  • Query Strings: Query strings are the standard way to pass parameters in HTTP GET requests.  A query string is the part of the URL between the ? and either a # or the end of the URL, and typically takes the form name=value&name2=value2.  Values passed via query string must be URL encoded.  See RFC3986 for more details on URLs.  (A sample request combining these elements is shown after this list.)
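Putting those pieces together, a request to a hypothetical library API might look like the following (the host, path, and parameters are purely illustrative):

  GET /library/branches/12/books/9780132350884?format=summary HTTP/1.1
  Host: api.example.com
  Accept: application/json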

What's New in Version 6.3?

Version 6.3 includes several enhancements to existing activities as well as new activities to better support RESTful Web APIs. 

New Transform Activities for JSON


Read JSON Activity

The Read JSON Activity parses JSON input and converts it to XML.  This new activity works much the same way the Read XML or Read Flat File Activities work: you select a schema, map a flat string into the Map Inputs, and the Map Outputs come out in the form that you specified.  For the Read JSON Activity you also have the option of specifying a sample JSON or XML message; the activity will determine whether you have specified JSON or XML, parse the sample message, and learn the schema.  See below for an example of specifying a sample JSON message:
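For instance, you might paste something like this into the activity so it can learn the schema (the message itself is purely illustrative):

  {
    "book": {
      "isbn": "9780132350884",
      "title": "Clean Code",
      "year": 2008,
      "authors": [ "Robert C. Martin" ]
    }
  }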


The sample message functionality also supports sample XML messages.

Write JSON Activity

The Write JSON Activity converts XML to the JSON format.  Much like the Read JSON Activity, you can specify either an existing schema in your project or a sample message.  Below is an example of an XML sample message:
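For example, an XML sample message mirroring the JSON sample above (again, just an illustration) might look like this:

  <book>
    <isbn>9780132350884</isbn>
    <title>Clean Code</title>
    <year>2008</year>
    <authors>
      <author>Robert C. Martin</author>
    </authors>
  </book>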

New HTTP Activities


Invoke Activities

As you can see above, there are a lot more HTTP activities than there were before.  These new activities are not fundamentally different from their counterparts in previous versions of Cast Iron; they retain the functionality that was there before while adding new features that make writing RESTful Web APIs much easier.  The Invoke Request activity is the general HTTP request activity and functions much like the old HTTP Post Request Activity, with some usability enhancements for path parameters, query strings, and HTTP headers; see the Receive Request section below for more details on what the changes look like and how they are used.  The Get Request, Post Request, Head Request, Put Request, and Delete Request activities all function similarly to Invoke Request, except that there is no need to specify the HTTP method as it is implied by the activity name.

Receive Request

The Receive Request activity has been updated to more easily support RESTful Web APIs.  Most of the functionality was there already but there have been a number of usability enhancements that make it easier to support things like path parameters, query strings, and HTTP Headers.  See the screenshots below for examples of the changes and how they are used:

The new configure pane allows users to specify path parameters and query string values, as well as choose which methods this activity will listen on.  Here we specify a path parameter, isbn, as well as the query parameters title and year.
The Request Headers pane allows users to declare headers up front, giving easier access to the values that clients send.  Here we add two headers, username and token, which can be specified by clients calling this service.


Finally, putting it all together, you can see that the username and token headers are exposed in the httpheaders section, the isbn path parameter is exposed in the new pathParameters node, and the query parameters title and year are exposed in the queryParameters node.  There is no longer a need to write JavaScript to parse query strings, or XPath expressions to search for headers.
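As a rough sketch (the exact element names and nesting in the activity's output may differ from this), an incoming request surfaces in the map along these lines:

  <httpheaders>
    <username>jsmith</username>
    <token>abc123</token>
  </httpheaders>
  <pathParameters>
    <isbn>9780132350884</isbn>
  </pathParameters>
  <queryParameters>
    <title>Clean Code</title>
    <year>2008</year>
  </queryParameters>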

Send Response

The Send Response activity functions much as it did before, with some usability enhancements for setting response headers.

Putting it all Together

These new activities are not major changes to how Cast Iron works, but all in all they provide much needed functionality for RESTful Web APIs.  The HTTP enhancements provide huge usability upgrades that make it much faster and easier to deal with path parameters, query strings, and HTTP headers and methods.  The Read and Write JSON activities supply the necessary tools for dealing with the JSON format, which along with XML is the standard message format in RESTful Web APIs.  All of these small improvements add up to a long awaited major gain in the ability to call and provide Web APIs.