Background
HP Asset Manager SaaS data need to be transferred across customer’s internal network without too much impact on the bandwidth between the two VPNs. The agreed solution is to replicate data in Oracle to a Microsoft SQL Server database in a local network. Any new records will be inserted into the new database while existing ones will be updated.
The steps below are the quick way to setup an upsert (update and insert) SSIS package that will replicate content of a table in one database platform to another based on the last modified date on a scheduled interval. While the example is very specific, the concept can easily be adopted for other purposes.
The assumption is all Data Sources are already setup including Oracle Data Provider.
In this example, SSIS is created to transfer data from amPortfolio table, which is one of the most active tables in HP Asset Manager.
One table called QueryTime was setup just initially, this table is used to store the time of when the last query was pulled for that particular table.
CREATE TABLE QueryTime ( Id INT IDENTITY(1,1) NOT NULL, TableName NVARCHAR(20) NOT NULL, QueryTimestamp DATETIME NOT NULL, CONSTRAINT PK_QueryTimeId PRIMARY KEY NONCLUSTERED (Id) )
Steps
-
Create a new DTSX package (i.e., Portfolio.dtsx). Since this will be deployed using Package Configurations, the ProtectionLevel in Properties needs to be set to DontSaveSensitive.
-
Create two variables qTimestamp and sqlPortfolio with DateTime and String data type respectively. The qTimestamp is to store the value pulled from the QueryTime table and sqlPortfolio is used to store the dynamic Oracle SQL in order to pull records since the last query timestamp.
The initial value of sqlPortfolio in essence is a simple select statement from the amPortfolio table in the Oracle DB.
SELECT * FROM AMPORTFOLIO
Except with all the fields listed out, which is the better way to query.
-
From the Toolbox, drag Execute SQL Task, Script Task, Data Flow Task, and another Execute SQL Task (1) to the Control Flow and connect the boxes by dragging the green line in the order shown below.
-
Double-click the first Execute SQL Task to open its editor. In the General section, change the Name to “Get Last QueryTime.” Select the Connection, set the ResultSet to Single row, and input the following SQLStatement to get the last query time.
SELECT QueryTimestamp FROM dbo.QueryTime WHERE (TableName = 'AMPORTFOLIO')
-
Still on the same active window, click on Result Set and on the right box, type in “QueryTimestamp” under the column Result Name and under Variable Name, choose the variable User::qTimestamp. Basically we take the value from column QueryTimestamp and store it in qTimestamp variable.
Click OK. -
Click once on the text “Script Task” and change the name to “Compose Oracle SQL” just to make it more meaningful. Double-click on the box to open the editor. In ReadOnlyVariables, if you click on the empty textbox, there should be an ellipsis (…) button. Click on the ellipsis, and select
User::qTimestamp
variable. Do the same for ReadWriteVariables but use variableUser::sqlPortfolio
this time. -
Click on Edit Script… and place the code in the Main() function.
public void Main() { string sqlCmd = string.Empty; string sqlPortfolio = Dts.Variables["sqlPortfolio"] .Value.ToString(); DateTime dTimeStamp = Convert.ToDateTime( Dts.Variables["qTimestamp"] .Value); string timeStamp = null; if (!string.IsNullOrEmpty(dTimeStamp.ToString())) timeStamp = dTimeStamp.ToString("yyyy-MM-dd HH:mm:ss"); else timeStamp = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss"); sqlCmd = string.Format("{0} WHERE TO_CHAR(DTLASTMODIF, 'YYYY-MM-DD HH24:MI:SS') >= '{1}' ORDER BY LPORTFOLIOITEMID", sqlPortfolio, timeStamp); // Store Oracle SQL in User:sqlPortfolio variable Dts.Variables["sqlPortfolio"].Value = sqlCmd; Dts.TaskResult = (int)ScriptResults.Success; }
Close the window and click OK.
-
Double-click on Data Flow Task or just click on the Data Flow tab.
-
Open the Toolbox and from Data Flow Sources, drag OLE DB Source into the Data Flow designer surface and name it amPortfolio to reflect the table name it’s pulling data from.
-
Open the Source Editor and select the OLE DB connection manager. Data access mode: is set to SQL command from variable and for Variable name select User::sqlPortfolio, and click OK.
-
Next, from the Toolbox, under Data Flow Transformations drag Lookup to the surface. Open the editor and set it according to these settings:
General
Cache mode: Full cache
Connection type: OLE DB connection manager
Specify how to handle rows with no matching entries: Redirect rows to no match outputConnection
OLE DB connection manager: Connection to the destination database.
Use results of an SQL query: with the following SQL statement:SELECT LPORTFOLIOITEMID FROM PortfolioItem ORDER BY LPORTFOLIOITEMID
Columns
Drag the LPORTFOLIOITEMID from Available Input Columns on the left box to the Available Lookup Columns on the right and click OK.
-
Drag OLE DB Destination, name it Insert and drag OLE DB Command and name it Update.
-
From the Lookup box, drag the green line arrow to Insert and a dialog box will pop up. Select the Lookup No Match Output from the drop down list and click OK.
-
Click on the Lookup box again and drag a new green line arrow to Update, and it will automatically assign the Lookup Match Output to it as shown on the picture below.
-
Double-click Insert and set Data access mode to Table or view and select the corresponding destination table from the drop-down list for Name of the table or the view. If the table does not exist yet, you can click on the button New… and SSIS will guess the table schema based on the source table.
The auto-generated schema is generally accurate, except the table name itself, so make sure you review the SQL first before you commit by clicking OK. In this case, the table name was changed from [Insert] to [PortfolioItem].
After the correct table is selected, click OK.
-
Double-click the Update and use the following settings:
Connection Managers
Select the Connection Manager from the drop-down list.
Component Properties
SqlCommand
Click on the ellipsis and input an update SQL command (shortened in this example) below:UPDATE [dbo].[PortfolioItem] SET [BCREATEDONTHEFLY] = ? ,[DTLASTMODIF] = ? ,[ASSETTAG] = ? WHERE [LPORTFOLIOITEMID] = ?
Column Mappings
This is the most tedious part of the setup. All the question mark in the SQL Update statement is translated into Param_# (i.e., Param_0, Param_1, Param_3, etc.) on the Destination Column. The order of the fields in the SQL statement determines the sequential number of the Param_#. So in this case, [BTCREATEDONTHEFLY] maps to Param_0 and the last field in the statement [LPORTFOLIOITEMID] is mapped to the last field in the Destination Column (i.e, Param_68). It’s crucial that you don’t miss a field or transpose the order.
Once this part is done, click OK.
-
Go back to the Control Flow tab and double-click on the last Execute SQL Task (Update QueryTime). This time we just need to store the new timestamp to the QueryTime table based on the latest record processed.
On the SQLStatement use the following SQL statement:
UPDATE dbo.QueryTime SET QueryTimestamp = (SELECT MAX(DTLASTMODIF) AS Expr1 FROM dbo.PortfolioItem) WHERE (TableName = 'AMPORTFOLIO')
Click OK.
Build the package and it should be ready to deploy. The final Control Flow should look like this.
Caveat
-
In order to get the package setup quickly, QueryTime table must have an initial value, at least with a table name (i.e., AMPORTFOLIO). To include all records initially, just put in 1900-01-01 and the date will be updated after the first run.
INSERT INTO QueryTime (TableName, QuertyTimestamp) VALUES ('AMPORTFOLIO', '')
- If the database has a large volume of tables, this solution might not be practical.
Further Reading
How to Create SSIS Package in Visual Studio 2017
How to Build SSIS Package for Different SQL Server Version with Visual Studio 2017 and SSDT
Leave a Reply