Wednesday, February 03, 2016

Processing large files in Oracle SOA


Recently I was working on a requirement to process a very large payload. Though we now have the MFT feature, which we can use to process such files, we will still go ahead and look at the three approaches to processing it: through BPEL, OSB, and MFT.

For BPEL and OSB the concept is the same, namely chunking. We will see a demo for BPEL, and you can replicate the same in OSB as well.

Before starting, let me clarify that chunking is different from debatching. When you debatch a file, you actually create multiple instances for the file; when you chunk read, you read the whole file in chunks within a single instance. This is a confusion many people have, so I thought I would clarify it. With that, we will go ahead and see how to create a BPEL process to chunk read a file. Furthermore, with chunk read the file does not get deleted once it has been read completely, so we will also see how to achieve that.
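To give an idea of what a chunked read looks like in configuration: you create a synchronous read operation and then hand-edit its .jca file to use the chunked interaction spec. A minimal sketch, where the directory, file name, and chunk size values are placeholders:

<!-- edited .jca for a chunked SynchRead; all values below are placeholders -->
<interaction-spec className="oracle.tip.adapter.file.ChunkedInteractionSpec">
  <property name="PhysicalDirectory" value="/tmp/in"/>
  <property name="FileName" value="bigfile.csv"/>
  <!-- number of records handed to the process per loop iteration -->
  <property name="ChunkSize" value="100"/>
</interaction-spec>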

I will give some details on the implementation; however, working code is already provided by Oracle at the following location:

https://java.net/projects/oraclesoasuite11g/downloads/directory/Adapters/File

In fact, the solution is already described in detail in the following blog:

https://technology.amis.nl/2014/05/07/processing-large-files-through-soa-suite-using-synchronous-file-read/

With this sample code and the blog, you should be able to create a chunk read sample easily.
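In essence, the BPEL process drives the chunked read inside a while loop, feeding the adapter the current position through To Properties and reading the new position back through From Properties. A minimal sketch of that loop, with hypothetical partner link and variable names:

<!-- loop until the adapter reports end of file; names are hypothetical -->
<while name="WhileNotEOF">
  <condition>$isEOF != 'true'</condition>
  <sequence>
    <invoke name="ReadChunk" partnerLink="ChunkedRead"
            portType="ns1:SynchRead_ptt" operation="SynchRead"
            inputVariable="ReadChunk_Input" outputVariable="ReadChunk_Output">
      <bpelx:toProperties>
        <bpelx:toProperty name="jca.file.FileName" variable="fileName"/>
        <bpelx:toProperty name="jca.file.Directory" variable="fileDirectory"/>
        <bpelx:toProperty name="jca.file.LineNumber" variable="lineNumber"/>
        <bpelx:toProperty name="jca.file.ColumnNumber" variable="columnNumber"/>
      </bpelx:toProperties>
      <bpelx:fromProperties>
        <bpelx:fromProperty name="jca.file.LineNumber" variable="lineNumber"/>
        <bpelx:fromProperty name="jca.file.ColumnNumber" variable="columnNumber"/>
        <bpelx:fromProperty name="jca.file.IsEOF" variable="isEOF"/>
      </bpelx:fromProperties>
    </invoke>
    <!-- transform the chunk and append it to the running output here -->
  </sequence>
</while>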

I will just add the extra part of deleting the file from the polling location.

If you implement the chunk read, you will find that the file does not get deleted after it has been read.

This is because the file can be deleted only outside the chunk read loop.

The file adapter provides a feature to delete a file from a location.

Provided you are capturing the file name and file location, you can easily create an adapter to delete the file after chunk reading.

Create a simple file adapter with the Sync Read option.

Once it is created, a .jca file will be generated.

Open the .jca file and update it as shown in the diagram.



The class name is oracle.tip.adapter.file.outbound.FileIoInteractionSpec.
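For reference, a minimal sketch of the edited interaction spec; the source directory and file name below are dummy values, since they get overridden at runtime by the header properties set on the invoke (shown further down):

<!-- .jca of the SynchRead adapter repurposed as a delete operation -->
<interaction-spec className="oracle.tip.adapter.file.outbound.FileIoInteractionSpec">
  <!-- dummy values, overridden at runtime via jca.file.* header properties -->
  <property name="SourcePhysicalDirectory" value="/tmp"/>
  <property name="SourceFileName" value="dummy.txt"/>
  <property name="Type" value="DELETE"/>
</interaction-spec>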

Connect the adapter using an invoke activity.

After connecting, the process will look like the following:




Since we have defined a logical name and a logical directory, we now pass the same in the adapter call.

Add two properties in the invoke activity (a sketch follows the list):

jca.file.TargetDirectory

jca.file.TargetFileName
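A minimal sketch of the resulting invoke in the .bpel source; the partner link, variable, and activity names here are hypothetical:

<!-- delete the polled file by passing its name and directory as header properties -->
<invoke name="InvokeDeleteFile" partnerLink="DeleteFile"
        portType="ns2:SynchRead_ptt" operation="SynchRead"
        inputVariable="DeleteFile_Input">
  <bpelx:toProperties>
    <bpelx:toProperty name="jca.file.TargetDirectory" variable="fileDirectory"/>
    <bpelx:toProperty name="jca.file.TargetFileName" variable="fileName"/>
  </bpelx:toProperties>
</invoke>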



Add an assign activity and copy the file name and directory to these variables.
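A sketch of that assign, which runs before the invoke above; it assumes the file name and directory arrive in the process input payload, and the element names are hypothetical:

<assign name="SetFileDetails">
  <!-- populate the variables consumed by the jca.file.* header properties -->
  <copy>
    <from>$inputVariable.payload/ns1:fileName</from>
    <to>$fileName</to>
  </copy>
  <copy>
    <from>$inputVariable.payload/ns1:directory</from>
    <to>$fileDirectory</to>
  </copy>
</assign>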



Having said that, there are situations where you will use MFT as well.

I have worked in a situation where the files were placed on a shared drive located on a different server.

I had to explicitly use MFT to transfer the file from the remote NAS path to the local SOA server.

Even if we implement a process using MFT, we have to make a local copy of the file on the server, so a local path is required; this is more or less a chunking mechanism (MFT internally uses chunking to process data if you are using the MFT adapter).

So I thought I had better stick to my chunking process and just use MFT to transfer the file from the shared drive to the local path.


3 comments:

Mahesh Padamatinti said...

Hi, glad to read your post. I have one question here.

After chunk reading the whole file, is it possible to process that large a payload in the SOA layer (like bpel >> mediator >> connector bpel)? Does it have any performance issues?


SOA said...

Mahesh,

In my case, since I was doing a chunk read, the transformation also happened on smaller chunks of the file, and at the end I appended all the transformed output. I used this final output to call a package in the DB. With chunk reading this was very smooth and I didn't see any performance lag. To give you more of an idea: I tried to process a 1 MB file without chunking and it took around 8 minutes (you have to increase the timeout) just for the transformation of the file, whereas with chunk reading my whole process completes in 30 seconds. If you are planning on chunk reading, there are a few things you should know beforehand:

1> Chunk reading doesn't allow you to re-read the same file name, so if you want to reprocess the same file you need to change its name. This happens because it creates a reference on the server. WebLogic has its own cycle to delete these references; I am yet to find out how it does it.

2> It does not move the file by itself; you have to move the file manually. When you move the file, the timestamp of the moved file will be the same as the time it was placed on the server, not the time it was actually moved. Further, when archiving happens, a unique timestamp key is automatically appended after the file name, and you might need to generate this key manually and append it to the file name.

Ravdeep said...

If BPEL is used for Chunking, we have "To Properties" and "From Properties" in Invoke to set the values for chunking in the loop.

Do you know where we can set these in OSB for chunking loops? -- (jca.file.LineNumber, jca.file.ColumnNumber, jca.file.IsEOF, jca.file.NoDataFound)
