Delaware Web Designers – Inclind, Inc Internet Professionals | Inclind, Inc – Delaware Web Designers – Professional Delaware Web Design Since 1999

Dec/09

18

Sync Drupal Content Using Services and xmlrpc()

Have you ever wondered how to push custom content from one Drupal site to another Drupal site?

There are a handful of ways to do this. One way would be to create an external database connection, talk to it, and update the data in our database through raw PHP. But that bypasses Drupal’s node API, is slow, and is also very un-Drupal.

Another way would be to utilize Domain Access, and publish content to affiliated sites. That works in some cases, but what if these sites are independent from each other with different companies managing them? You would then have the nightmare of dealing with prefixed tables, back-end training issues, and the occasional node overlap from misconfiguring Domain Access.

A third way would be to use FeedAPI (or its successor, Feeds) to read from an RSS feed, then parse and import that content at regular intervals. Sounds great, but if you plan on importing custom node types that have extensive CCK fields, files, and images, prepare to sit down and code plugins and parsers galore so that those CCK fields can be mapped as import targets.

The third way is the one I thought I could get working. It seems so simple in theory: create an RSS/XML/JSON data structure with Views, then tell Feeds to take that feed and parse it. True, it works if you are using a basic content type like Story or Page, but all bets are off once CCK comes into play, and who doesn’t use CCK these days? Hats off to Alex Barth / Development Seed on Feeds, though; it’s a great start and sure to grow into a monster data-consuming module. I do want to use it for future projects, just not for this function.

One way that most people are not aware of is to take advantage of Drupal’s XML-RPC functions through the Services module. In short, the Services module provides:

A standardized solution of integrating external applications with Drupal. Service callbacks may be used with multiple interfaces like XMLRPC, JSON, REST, SOAP, AMF, etc. This allows a Drupal site to provide web services via multiple interfaces while using the same callback code.

So we got the idea to have the ‘master’ Drupal site act as an XML-RPC server through the Services module, and to provide our own custom services in order to get the job done. The sync runs once an hour, requires no user interaction, fails silently, and only requires two modules. It also means we do not have to use:

  • Domain Access
  • FeedAPI/Feeds
  • Multi-site Setup

The benefit is that every site can run independently of the others and be customized in various ways, while still receiving key content from the parent website. Such is the beauty of XML-RPC.
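For readers who have not seen the protocol on the wire, an XML-RPC call is an HTTP POST whose body is a small XML document naming the method and its parameters. The sketch below builds such a body by hand in plain PHP. The function name example_xmlrpc_request_body is hypothetical and only handles integers and strings; on a Drupal site, the core xmlrpc() function does this work (and the HTTP request) for you.

```php
<?php
// Illustrative only: builds the XML body that an XML-RPC client would POST.
// example_xmlrpc_request_body() is a made-up name, not a Drupal function.
function example_xmlrpc_request_body($method, array $params = array()) {
  $xml = '<?xml version="1.0"?>' . "\n";
  $xml .= '<methodCall><methodName>' . htmlspecialchars($method) . '</methodName><params>';
  foreach ($params as $param) {
    // Only two scalar types are covered in this sketch.
    $type = is_int($param) ? 'int' : 'string';
    $xml .= '<param><value><' . $type . '>' . htmlspecialchars((string) $param) . '</' . $type . '></value></param>';
  }
  $xml .= '</params></methodCall>';
  return $xml;
}

// The body for a node.get request; 1452 is just a sample node ID.
print example_xmlrpc_request_body('node.get', array(1452));
```

Because the payload travels in the request body, the endpoint has to be called with POST; opening it in a browser issues a GET and carries no method call at all.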

The next part was to create my own custom method so I can request a list of node IDs from the parent server. There is no method for getting all node IDs out of the box, but you can easily create one. Here is what I came up with:

<?php
 
// CODE ON PARENT SERVER
 
function homes_service_service() {
	return array(
		array(
			'#method' => 'node.getAllHomes',
			'#callback' => 'homes_service_node_get_all_homes',
			'#return' => 'array',
			'#help' => 'Return a list of node id\'s that are of the Home content type.',
		),
	);
}
 
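function homes_service_node_get_all_homes() {
	// Start with an empty array so the service still returns a list when no nodes match.
	$homes = array();

	// Single quotes around the placeholder keep the SQL portable.
	$result = db_query("SELECT nid FROM {node} WHERE type = '%s'", 'homes');
 
	while ($home = db_fetch_array($result)) {
		if ($home['nid']) {
			$homes[] = $home['nid'];
		}
	}
 
	return $homes;
}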
 
// CODE ON REMOTE SERVERS
 
function homes_sync_get_node_list() {
 
	// user authentication code here
	// connect as 'services' user with 'services' role
	// that way, drupal permissions are respected
 
	// user.login method used
	// we get a successful login if the return is an array and the array values match our login information
        // this needs a little more work so the parent server knows exactly who is requesting information       
 
	$user = 'user';
	$password = 'password';
 
	$authenticate = xmlrpc('http://upgrade.beracahhomes.com/services/xmlrpc', 'user.login', $user, $password);
 
	if (is_array($authenticate) && $authenticate['user']['name'] === $user && $authenticate['user']['status'] == 1) {
 
		$node_ids = xmlrpc('http://www.parentsite.com/services/xmlrpc', 'node.getAllHomes');
 
		if (xmlrpc_error() || !is_array($node_ids)) {
			// xmlrpc_error() returns an object, so log the message text instead.
			watchdog('homes_sync', 'Error getting node list from parent server. Error: @error.', array('@error' => xmlrpc_error_msg()), WATCHDOG_CRITICAL);
		} else {
			// Store a plain list of IDs for homes_sync_perform_update().
			variable_set('parent_home_nodes', array_values($node_ids));
			watchdog('homes_sync', 'Successfully retrieved node list from parent server.', array(), WATCHDOG_NOTICE);
		}
	}
 
	homes_sync_perform_update();
}
 
function homes_sync_perform_update() {
 
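	// Default to an empty array so foreach() has something valid to loop over
	// when no node list has been stored yet.
	$node_ids = variable_get('parent_home_nodes', array());
 
	foreach ($node_ids as $nid) {
		$data = xmlrpc('http://www.parentsite.com/services/xmlrpc', 'node.get', $nid);
 
		if (xmlrpc_error()) {
			// xmlrpc_error() returns an object, so log the message text instead.
			watchdog('homes_sync', 'Could not perform XMLRPC request. Error: @error.', array('@error' => xmlrpc_error_msg()), WATCHDOG_CRITICAL);
		} else {
			if (is_array($data)) {
				// Look for a local copy only after the request has succeeded.
				$result = db_fetch_array(db_query("SELECT n.nid, n.title, n.type FROM {node} n WHERE n.title = '%s' AND n.type = '%s'", $data['title'], 'homes'));
 
				// node_save() expects an object, not a string.
				$node = new stdClass();
 
				if ($result && $result['nid']) {
					$node->nid = $result['nid'];
				}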
 
				$node->type = $data['type'];
				$node->uid = 1;
				$node->status = $data['status'];
				$node->created = $data['created'];
				$node->changed = $data['changed'];
				$node->comment = $data['comment'];
				$node->promote = $data['promote'];
				$node->moderate = $data['moderate'];
				$node->sticky = $data['sticky'];
				$node->tnid = $data['tnid'];
				$node->translate = $data['translate'];
				$node->title = $data['title'];
				$node->body = $data['body'];
				$node->teaser = $data['teaser'];
				$node->format = $data['format'];
				$node->name = $data['name'];
				$node->data = $data['data'];
				$node->path = $data['path'];
				$node->field_type[0]['value'] = $data['field_type'][0]['value'];
				$node->field_number_of_bathrooms[0]['value'] = $data['field_number_of_bathrooms'][0]['value'];
				$node->field_number_of_bedrooms[0]['value'] = $data['field_number_of_bedrooms'][0]['value'];
				$node->field_number_of_floors[0]['value'] = $data['field_number_of_floors'][0]['value'];
				$node->field_square_footage[0]['value'] = $data['field_square_footage'][0]['value'];
 
				node_save($node);
 
				unset($node);
			}
		}	
	}
}
 
function homes_sync_cron() {
	homes_sync_get_node_list();	
}
?>

For some reason, I can’t have two instances of wp-syntax in a single post, so bear with me. The above code is part of two separate modules, one on the parent server and one on the remote server (as noted in the PHP comments).

The module code (after CODE ON PARENT SERVER) resides on the parent server. It uses hook_service to talk to Services and expose node.getAllHomes as a request method. That method calls the homes_service_node_get_all_homes function, which runs a SQL query and returns an array of the node IDs I am looking for.

I could easily return all nodes as their full node objects, but for performance reasons, I’d rather get a short list and save them on the receiving end. That way, I can create/update a handful at a time instead of all at once, which lightens the load on the database and application server.

On the receiving end, we need some code that creates the request that is sent to the parent server. Using hook_cron, I can send this request on an automated basis. The module code (after CODE ON REMOTE SERVERS) fetches the node list, stores it locally, and then issues one request per node to retrieve the node data. From there, it constructs a node object and saves it with node_save. If the node already exists, based on type and title (our node IDs will not match, but our titles certainly will, since child sites cannot create or edit these nodes), it grabs the node ID from the local database and puts that into the node object. node_save is a great function that can handle creating or updating data with the same structure. So, if the update runs again and passes us the same data, it will recognize that it already has the node and update the record instead of creating duplicates. Slick.
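The property-by-property copy in the listing can also be written as a loop. The following is a minimal sketch, not the code the module actually runs: homes_sync_build_node is a hypothetical helper, and it assumes the array returned by node.get uses the same keys as the listing above. Unlike the original, it copies each CCK field array whole rather than only the [0]['value'] entry, which is an assumption that suits simple text and number fields.

```php
<?php
// Hypothetical helper: turn the array returned by node.get into an object
// suitable for node_save(). Not part of the original module.
function homes_sync_build_node(array $data, $existing_nid = NULL) {
  // Core properties the original listing copies across.
  $core = array('type', 'status', 'created', 'changed', 'comment', 'promote',
    'moderate', 'sticky', 'tnid', 'translate', 'title', 'body', 'teaser',
    'format', 'name', 'data', 'path');

  $node = new stdClass();

  // Reuse the local node ID so node_save() updates instead of inserting.
  if (!empty($existing_nid)) {
    $node->nid = (int) $existing_nid;
  }

  foreach ($data as $key => $value) {
    $key = (string) $key;
    // Copy whitelisted core properties and anything that looks like a CCK field.
    // The parent's own nid is deliberately skipped.
    if (in_array($key, $core, TRUE) || strpos($key, 'field_') === 0) {
      $node->$key = $value;
    }
  }

  // As in the original, synced nodes are owned by user 1.
  $node->uid = 1;

  return $node;
}
```

Inside homes_sync_perform_update(), the result could be handed to node_save() in place of the long list of assignments, with $result['nid'] passed as the second argument when a local match is found.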

So what do we have? If we deploy the remote code on multiple remote sites, they can all sync up specific content with the parent website without anyone having to do anything special. So long as the parent site admin provides content, everyone will get it.

This is a quick implementation of course. I am fleshing out the authentication further as well as staggering the amount of data updated every cron run. With the core functionality in place, I can focus on security and speed. The next step is retrieving files and images using the same methods, and I will go over that in another post when I get a chance.
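One way the staggering could work is to process a fixed-size slice of the stored node list on each cron run and remember where to resume. This is only a sketch of that idea under my own assumptions: homes_sync_next_batch and the 'homes_sync_offset' variable name mentioned afterwards are hypothetical and do not appear in the module above.

```php
<?php
// Hypothetical helper: pick the node IDs to process on this cron run and
// work out where the next run should start. Pure PHP, no Drupal calls.
function homes_sync_next_batch(array $node_ids, $offset, $batch_size) {
  $total = count($node_ids);

  if ($total === 0 || $batch_size < 1) {
    return array('batch' => array(), 'next_offset' => 0);
  }

  // Start over if the list shrank since the last run.
  if ($offset < 0 || $offset >= $total) {
    $offset = 0;
  }

  $batch = array_slice($node_ids, $offset, $batch_size);
  $next = $offset + count($batch);

  return array(
    'batch' => $batch,
    // Wrap around to the beginning once the end of the list is reached.
    'next_offset' => ($next >= $total) ? 0 : $next,
  );
}
```

In the cron path, the offset could be read with variable_get('homes_sync_offset', 0) before the call and written back with variable_set() afterwards, so each run picks up where the previous one stopped.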

I hope fellow Drupalers found this useful, because I found that documentation on this is touch and go (and probably a reason not many utilize XMLRPC/Services). It’s a very powerful feature.


11 Comments for Sync Drupal Content Using Services and xmlrpc()

Sync Files in Drupal Using Services and xmlrpc() - Delaware Web Designers – Inclind, Inc Internet Professionals | January 5, 2010 at 9:24 am

[...] a previous entry we explored content syncing/distributing using the Services module and XMLRPC in Drupal. We learned [...]

Author comment by bwinett | July 20, 2010 at 4:14 pm

Excellent post. Two questions:

1) Looks like you have to create 2 modules for every content type.
2) How would you handle embedded images, links to files on the server, and attached files?

Author comment by Kevin Quillen | July 21, 2010 at 8:05 am

1. Not necessarily. You just need a service to call and functions to handle the response. As long as you have that, you can have multiple service functions handling various types of content. My functions were just specific to handle this one use-case with a content type. If you have multiple content types where the only real difference is the name (and no CCK), you could write a more abstract data retriever and get the same functionality.

2. There are a few ways you can do this depending on the approach. You can use some of the file.get() services included in the core Services module, or write your own. If it is attached to the node, node.get() should be able to tell you there are files attached. In the case that they are linked within the content, you might have to parse the body text for img tags, and do file_get_contents() if you find anything matching the file system path. I knew that wouldn’t be the case with what I was working on, so I did not have to go that far.

The great thing about the Services module is its flexibility to do just about anything data-wise.

Author comment by trevor.james | July 21, 2010 at 9:50 am

Hello! I’ve enabled this code via 2 modules on 2 sites. I have the parent site using the first section of code you supplied as a custom module in /sites/all/modules/services/services/homes_service.

I then implemented the homes_sync module in /sites/all/modules on my other site.

Everything enables. When I run cron on the child site however I’m getting the following error and the Homes content is not populating over from my Parent site. Any idea what could be causing this?

Status report
warning: Invalid argument supplied for foreach() in /var/www/msde2/sites/all/modules/homes_sync/homes_sync.module on line 41.

I look forward to hearing back from you soon.

Best-

Trevor

Author comment by Kevin Quillen | July 21, 2010 at 1:34 pm

Looks like the foreach has nothing to loop over. I don’t see anything immediately wrong in the module, though since the post has been written, some of the code has changed.

Author comment by trevor.james | July 21, 2010 at 2:34 pm

Oh ok – I deleted the lines that you had for your specific fields thinking I didn’t need those since I didn’t have fields in the content type (just the default body and title) but let me try adding them back and see if I can fix it that way.

Is that what you mean by the foreach has nothing to loop over?

Best-

Trevor

Author comment by trevor.james | July 21, 2010 at 2:51 pm

Something seems odd though – I do have nodes published using this content type and the foreach is looping over those published nodes correct?

-Trevor

Author comment by trevor.james | July 21, 2010 at 2:59 pm

I added the fields to the CCK and then added the $node->field_ lines back into the homes_sync module. Still not working.

-Trevor

Author comment by trevor.james | July 21, 2010 at 3:03 pm

If I run the node.getAllHomes via the Services interface on my parent site it does retrieve the correct array of node IDs so that module seems to be working:

Result

Array
(
[0] => 1452
[1] => 1453
)

Author comment by trevor.james | July 21, 2010 at 3:24 pm

I think it’s a permissions issue. Looks like XML-RPC is only accepting POST requests. In the parent site browser if I go to xmlrpc I get this:

XML-RPC server accepts POST requests only.

And then if I try getting the nodes on the child site via the Devel module I get this:

stdClass Object
(
[is_error] => 1
[code] => 1
[message] => Access denied
)

How can I configure the one site to accept GET requests?

Author comment by bwinett | July 22, 2010 at 1:55 pm

Thanks for your response and info. Just thought I’d let you know I found your follow-up article discussing my files question:

http://www.delawarewebdesigner.com/how-tos/sync-files-in-drupal-using-services-and-xmlrpc.htm

Good stuff!

