Sphider With Sponsored Links

After some digging around, I decided that Sphider would be a good foundation for a personal project, a pet- and animal-related search engine ( http://www.petadvantage.com ). To monetize the traffic I use the inClick Ad Server ( http://www.inclick.net/ ), a pay-per-click ad server product.

The goals were:
	A. Serve the top three ads as Featured
	B. Serve the remaining results as Sponsored
	C. Keep it fast.
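
To make goals A and B concrete, here is a rough sketch of the split: the first three ads become Featured and whatever is left becomes Sponsored. The function name splitAds and the sample ad values are made up for illustration; the real mod does this inside generateSponsoredLinks() in Phase 2.

```php
<?php
// Hypothetical helper, not part of Sphider or inClick: splits a flat list of
// ads into the first $featuredCount "featured" ads and the remaining
// "sponsored" ads.
function splitAds(array $ads, $featuredCount = 3)
{
    return array(
        'featured'  => array_slice($ads, 0, $featuredCount),
        'sponsored' => array_slice($ads, $featuredCount),
    );
}

$groups = splitAds(array('ad1', 'ad2', 'ad3', 'ad4', 'ad5'));
// $groups['featured'] holds ad1 through ad3; $groups['sponsored'] holds ad4 and ad5.
```

If fewer than three ads come back, everything lands in the featured group and the sponsored group is simply empty.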

********************************
Phase 1: Install the Ad Server.
In order to offer sponsored links, you will first need to deploy the inClick Ad Server. I'm sure other ad servers can work, but this mod is based on the inClick product. Besides, once you have the ad server up and running, you can not only start selling ad space on your search results, but also allow others to show ads for you in an "Ads By" network.

********************************
Phase 2: Update Sphider Code
Now that you have a running ad server, you need to modify the Sphider source code. The following is based on the standard template and could be easily modified to fit your needs.

1. Add the following to the end of searchfuncs.php just above the closing PHP tag:

class inResults {
	var $inClickClientId;
	var $inClickAdChannel;
	var $inClickAdKeyword;
	var $inClickAdCount;
	var $inClickXMLLocation;
	var $inClickUserIP;
	var $inClickAdOffset = 0;
	var $inClickReferringURL;
	var $inClickLimited = 0;
	var $inClickOffset = 0;
	var $inClickContextMatch = 0;
	var $inClickSEMURL;

	function getResults() {
		$inclick_client_id = $this->inClickClientId;
		$inclick_ad_channel = $this->inClickAdChannel;
		$inclick_ad_keyword = $this->inClickAdKeyword;
		$inclick_ad_count = $this->inClickAdCount;
		$inclick_xml_location = $this->inClickXMLLocation;
		$uip = $this->inClickUserIP;
		$ad_offset = $this->inClickAdOffset; // step 2 sets inClickAdOffset, so read that property here
		$referring_page = $this->inClickReferringURL;
		$contextual = $this->inClickContextMatch;
		$sem_url = $this->inClickSEMURL;
		$referring_page = urlencode($referring_page);
		$sem_url = urlencode($sem_url);
		$inclick_ad_keyword = urlencode($inclick_ad_keyword);

		/* BUILD THE REQUEST */
		$post_data = "contextual=$contextual&p_id=$inclick_client_id&ad_count=$inclick_ad_count&doc_offset=$ad_offset&channel=$inclick_ad_channel&keyword=$inclick_ad_keyword&limited=1&uip=$uip&sem_url=$sem_url&referring_page=$referring_page";
		$url_location = "$inclick_xml_location/xml_feed.php?";

		/* GET THE REQUEST FROM THE AD SERVER */
		$get_html = new fetchXML();
		$get_html->user_agent = "inClick/3.0 (Ad Server Ad Requestor 1.0)";
		$get_html->url = $url_location . $post_data;
		$response_1 = $get_html->getWebsite();
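		// Drop anything before the XML declaration (split() was removed in PHP 7)
		$xml_start = strpos($response_1, "<?xml");
		$response_1 = ($xml_start === false) ? "" : substr($response_1, $xml_start);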


		/* PARSE THE REQUEST SECTION*/
		$xml_text = $response_1;
		$xml2a      = new convertXMLToArray();
		$root_node  = $xml2a->parse($xml_text);
		$element_array     = array_shift($root_node["_ELEMENTS"]);
		$idx = 0;
		$ad_count = 0;
		while ($idx < (int)$inclick_ad_count){
			// Array keys are quoted strings; isset() skips slots the feed did not fill
			if(isset($element_array['_ELEMENTS'][$idx]['_ELEMENTS'][0]['_DATA']) && strlen($element_array['_ELEMENTS'][$idx]['_ELEMENTS'][0]['_DATA']) > 5){
				$inresults['ad_heading'][$idx] = $element_array['_ELEMENTS'][$idx]['_ELEMENTS'][0]['_DATA'];
				$inresults['ad_uri'][$idx] = $element_array['_ELEMENTS'][$idx]['_ELEMENTS'][1]['_DATA'];
				$inresults['ad_cpc'][$idx] = sprintf("%01.2f",$element_array['_ELEMENTS'][$idx]['_ELEMENTS'][2]['_DATA']);
				$inresults['ad_desc'][$idx] = $element_array['_ELEMENTS'][$idx]['_ELEMENTS'][3]['_DATA'];
				$inresults['ad_url'][$idx] = $element_array['_ELEMENTS'][$idx]['_ELEMENTS'][4]['_DATA'];
				++$ad_count;
			}
			++$idx;
		}
		$inresults['adcount'] = $ad_count;
		return $inresults;
	}
	
	function generateSponsoredLinks(){
		$ads = $this->getResults();
		$sponsored_results = $ads['adcount'];
		$idx = 0;
		$sponsored_ad_premium = "";
		$sponsored_ad_standard = "";
		$sponsored_links['premium'] = "";
		$sponsored_links['standard'] = "";
		if($sponsored_results > 0) {
			//CREATE HTML WIDGET FOR FEATURED SPONSORS
			while($idx < 3 AND $idx < $sponsored_results){
				$sponsored_ad_premium .= ("
					<!-- results listing --> 
					  <a href=\"".$ads['ad_uri'][$idx] ."\" class=\"title\"><b>".$ads['ad_heading'][$idx] ."</b></a><br />
					  <div class=\"description\">
					  <span class=\"url\"> ". str_replace("http://","",$ads['ad_url'][$idx]) ."</span>&nbsp;&nbsp;&nbsp;&nbsp;".$ads['ad_desc'][$idx] ."</div>
					  <br>");
				++$idx;
			}
			$sponsored_ad_premium = substr($sponsored_ad_premium,0,-7);
			$sponsored_results_premium = ("
				<div id=\"sponsored_results_premium\">
				<div style=\"padding-right: 3px; float: right;\" align=\"right\"><font size=\"-1\"><b>Featured Sponsors</b> (<a href=advertise.php>more info</a>)</font></div>
					".$sponsored_ad_premium."
				</div>
				");

			//CREATE HTML WIDGET FOR THE STANDARD SPONSORED LINKS
			if($sponsored_results > 3 ){
				while($idx < $sponsored_results){
					$sponsored_ad_standard .= ("
					<!-- results listing --> 
				  <a href=\"".$ads['ad_uri'][$idx] ."\" class=\"title\">".$ads['ad_heading'][$idx] ."</a><br />
				  <div class=\"description\">
				    ".$ads['ad_desc'][$idx] ."</div>
				  <div class=\"sponsored_url\">
				    ". str_replace("http://","",$ads['ad_url'][$idx]) ."</div>
				  <br />");

					++$idx;
				}
				$sponsored_ad_standard = substr($sponsored_ad_standard,0,-7);

				$sponsored_results_standard = ("
				<div id=\"rightcolumn\">
					<div class=\"innertube\">
					 	<div id=\"sponsored_results_standard\">
					 	<b>Sponsored Links</b> (<a href=advertise.php>more info</a>) <hr size=\"1\">
						$sponsored_ad_standard
						<br><center><a href=advertise.php>Advertise here</a></center>
					    </div>
					</div>
				</div>
			");
			}
			$sponsored_links['premium'] = $sponsored_results_premium;
			$sponsored_links['standard'] = isset($sponsored_results_standard) ? $sponsored_results_standard : ""; // not set when three or fewer ads came back
		}
		return $sponsored_links;
	}
}
//THE FOLLOWING CLASS, INRESULTS, IS COPYRIGHT 2001-2007 INMOTION GROUP
// AND IS DISTRIBUTED AS PART OF THE INCLICK AD SERVER.
class fetchXML{
	var $url;
	var $user_agent = "GetAds/1.0";
	var $proxy;
	var $proxy_port;

	function getWebsite()
	{
		$siteUrl = $this->url;
		$pageString = "";

		$urlParts = parse_url($siteUrl);
		if (! array_key_exists("port", $urlParts))
		{
			$urlParts["port"] = 80;
		}
		ini_set('user_agent',"$this->user_agent");
		$sockDescriptor = @fsockopen($urlParts['host'], $urlParts['port'], $errorNumber, $errorValue, 5);
		if ($sockDescriptor)
		{
			$host = $urlParts["host"];
			$port = $urlParts["port"];
			$path = "/";
			if (array_key_exists("path", $urlParts))
			{
				$path = $urlParts["path"];
			}
			if (array_key_exists("query", $urlParts))
			{
				$path = $path . "?" . $urlParts["query"];
			}

			$out = "GET $path HTTP/1.1\r\n";
			$out .= "User-Agent: $this->user_agent\r\n";
			$out .= "Host: $host\r\n";
			$out .= "Connection: Close\r\n\r\n";

			fwrite($sockDescriptor, $out);
			$response = "";
			$response_code = 0;
			$loop_count = 0;

			while (!feof($sockDescriptor) && $response_code == 0 && $loop_count < 50)
			{
				$response_line = fread($sockDescriptor, 1000);
				$pageString .= $response_line;
				$loop_count++;
			}

			//Skip all header information if possible
			$tempString = strtolower($pageString);
			$position = strpos($tempString, "<html");
			if ($position !== false)
			{
				$pageString = substr($pageString, $position);
			}
			fclose($sockDescriptor);
		}
		else
		{
			return "127.0.0.1";
		}
		return $pageString;
	}
}


// THE FOLLOWING CLASS IS A MODIFICATION OF THE CODE CREATED BY DANTE LORENSO
// INFORMATION REGARDING THE FOLLOWING CLASS CAN BE FOUND AT
// http://www.phpbuilder.com/columns/lorenso20021221.php3?print_mode=1

class convertXMLToArray {
	var $parser;
	var $node_stack = array();

	function XMLToArray($xmlstring="") {
		if ($xmlstring) return($this->parse($xmlstring));
		return(true);
	}
	function parse($xmlstring="") {
		$this->parser = xml_parser_create();
		xml_set_object($this->parser, $this);
		xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, false);
		xml_set_element_handler($this->parser, "startElement", "endElement");
		xml_set_character_data_handler($this->parser, "characterData");
		$this->node_stack = array();
		$this->startElement(null, "root", array());
		xml_parse($this->parser, $xmlstring);
		xml_parser_free($this->parser);
		$rnode = array_pop($this->node_stack);
		return($rnode);
	}
	function startElement($parser, $name, $attrs) {
		// create a new node...
		$node = array();
		$node["_NAME"]      = $name;
		foreach ($attrs as $key => $value) {
			$node[$key] = $value;
		}
		$node["_DATA"]      = "";
		$node["_ELEMENTS"]  = array();

		// add the new node to the end of the node stack
		array_push($this->node_stack, $node);
	}
	function endElement($parser, $name) {
		// pop this element off the node stack
		$node = array_pop($this->node_stack);
		$node["_DATA"] = trim($node["_DATA"]);

		// and add it as an element of the last node in the stack...
		$lastnode = count($this->node_stack);
		array_push($this->node_stack[$lastnode-1]["_ELEMENTS"], $node);
	}
	function characterData($parser, $data) {
		// add this data to the last node in the stack...
		$lastnode = count($this->node_stack);
		$this->node_stack[$lastnode-1]["_DATA"] .= $data;
	}
}
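
If your PHP build has the SimpleXML extension (it is enabled by default), the fetch-then-parse step above could be sketched more compactly as below. This is only an illustration, not inClick's code: the function name parseAdFeed and the element names in the sample feed are made up, and the only thing taken from getResults() is the field order (heading, click URI, CPC, description, display URL) and the "heading longer than 5 characters" check.

```php
<?php
// Sketch only: parse an ad feed with SimpleXML instead of the hand-rolled parser.
// Fields are read by position, so the (assumed) element names do not matter.
function parseAdFeed($response)
{
    $ads = array();
    // Drop anything (such as HTTP headers) that precedes the XML declaration.
    $start = strpos($response, '<?xml');
    if ($start === false) {
        return $ads;
    }
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $xml = simplexml_load_string(substr($response, $start));
    libxml_use_internal_errors($previous);
    if ($xml === false) {
        return $ads;
    }
    foreach ($xml->children() as $ad) {
        $fields = array();
        foreach ($ad->children() as $field) {
            $fields[] = trim((string) $field);
        }
        if (count($fields) < 5 || strlen($fields[0]) <= 5) {
            continue; // skip incomplete entries and very short headings
        }
        $ads[] = array(
            'ad_heading' => $fields[0],
            'ad_uri'     => $fields[1],
            'ad_cpc'     => sprintf('%01.2f', (float) $fields[2]),
            'ad_desc'    => $fields[3],
            'ad_url'     => $fields[4],
        );
    }
    return $ads;
}

// Made-up response: a header line followed by a one-ad feed.
$sample = "HTTP/1.1 200 OK\r\n\r\n"
    . '<?xml version="1.0"?><ads><ad>'
    . '<heading>Organic Dog Food</heading>'
    . '<uri>http://ads.example.com/click?id=1</uri>'
    . '<cpc>0.5</cpc>'
    . '<desc>Healthy meals for dogs.</desc>'
    . '<url>http://www.example.com</url>'
    . '</ad></ads>';
$ads = parseAdFeed($sample);
```

Unlike getResults(), this returns one array per ad rather than parallel arrays, so the HTML-building loop would need a matching tweak before you could swap it in.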

------------------------

2. Add the following into your search results page after "extract($search_results);"

    /* ADDED FOR SPONSORED RESULTS */

    //GET SPONSORED RESULTS
    if($search_results['num_of_results'] > 0){
	
	//DETERMINE HOW MANY ADS TO GET
	if($search_results['num_of_results'] < 6){
		$ad_count = $search_results['num_of_results'] + 2;
	} else {
		$ad_count = 10;
	}
	
	//DETERMINE THE OFFSET
	if(array_key_exists('start',$_REQUEST)){
		$ad_offset = ($_REQUEST['start'] * 10)-10;
	} else {
		$ad_offset = 0;
	}
	
	//BUILD THE REQUEST FOR ADS FROM THE AD SERVER CLASS
	$ads = array();
	$getAds = new inResults();
	$referring_page = "http://".$_SERVER["SERVER_NAME"] . $_SERVER["REQUEST_URI"];
	$ad_query = str_replace("\"","",$query);
	$getAds->inClickAdChannel = 1; // THE CHANNEL TO DRAW ADS FROM
	$getAds->inClickClientId = $p_id; //THE PUBLISHER ID
	$getAds->inClickAdKeyword = $ad_query; //THE KEYWORD
	$getAds->inClickAdCount = $ad_count;  //THE NUMBER OF ADS TO RETRIEVE
	$getAds->inClickXMLLocation = "http://www.pettraffic.com/network/ads"; //THE LOCATION OF THE XML FEED
	$getAds->inClickUserIP = $_SERVER['REMOTE_ADDR']; //THE IP ADDRESS OF THE SEARCHER
	$getAds->inClickAdOffset = $ad_offset; //THE OFFSET COUNT
	$getAds->inClickReferringURL = $referring_page; //THE URL OF THE PAGE WE ARE ON
	$getAds->inClickLimited = 1; //SET TO LIMITED (1) TO SHOW ONLY MATCHED RESULTS
	
	//GET ADS FROM AD SERVER, POPULATED AS A MULTI-DIMENSIONAL ARRAY
	//$ads = $getAds->getResults(); //GET THE ADS FROM THE AD SERVER
	
	$sponsored_links = $getAds->generateSponsoredLinks(); //GETS ADS FROM AD SERVER AND GENERATES HTML CODE FRAGMENTS
	
    }
    //END OF SPONSORED RESULTS
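
The ad count and offset rules in the block above can be pulled out into two small functions if you want to test them on their own. The names adCountFor and adOffsetFor are illustrative only, and the sketch assumes, as the block above does, that "start" is a page number beginning at 1.

```php
<?php
// Illustrative helpers (names are not from Sphider or inClick) that mirror
// the ad-count and offset rules from step 2.
function adCountFor($numResults)
{
    // Fewer than six organic results: request two more ads than results.
    // Otherwise request a full page of ten ads.
    return ($numResults < 6) ? $numResults + 2 : 10;
}

function adOffsetFor($page)
{
    // Page 1 starts at offset 0, page 2 at 10, and so on.
    // Missing or invalid page numbers fall back to the first page.
    $page = max(1, (int) $page);
    return ($page * 10) - 10;
}

$count  = adCountFor(3);   // three results, so five ads are requested
$offset = adOffsetFor(3);  // third page, so the offset is 20
```

The max(1, ...) guard is an addition of the sketch: it keeps the offset from going negative if "start" arrives empty or as 0.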

------------------------
3. Add the following into the results page to display paid ads:
	Look for the following line:
		<div id="results">
	Immediately after the above line, add the following two lines:
		<?php echo $sponsored_links['premium'];?>
		<?php echo $sponsored_links['standard'];?>
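
Note that $sponsored_links is only created when the search returned results, so on an empty search the two echo lines refer to a variable that does not exist. One way to guard against that is a tiny wrapper like the one below; sponsoredFragment is a made-up name, not something Sphider provides.

```php
<?php
// Hypothetical helper (not part of Sphider): returns the requested HTML
// fragment, or an empty string when no ads were generated for this request.
function sponsoredFragment($links, $key)
{
    return (is_array($links) && isset($links[$key])) ? $links[$key] : '';
}

// With ads: the stored fragment is returned unchanged.
$with_ads = array('premium' => '<div>ad</div>', 'standard' => '');
echo sponsoredFragment($with_ads, 'premium');

// Without ads (for example, a search with no results): nothing is printed.
echo sponsoredFragment(null, 'premium');
```

In the template you would then echo sponsoredFragment(isset($sponsored_links) ? $sponsored_links : null, 'premium') instead of indexing the array directly.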

------------------------

4. Add the following to your CSS file:
	#sponsored_results_premium{
		width: 100%;
		/*height: 500px;*/
		font-family: Arial, Tahoma, Verdana, Helvetica, sans-serif;
		font-size: 12px;
		color: #000000;
		background-color: #FFF4DF;
		float: left;
		text-align:left;
		padding:6px;
	}
	#sponsored_results_standard{
		width: 190px;
		/*height: 500px;*/
		font-family: Arial, Tahoma, Verdana, Helvetica, sans-serif;
		font-size: 12px;
		color: #000000;
		/*background-color: #FFFFFF;*/
		float: right;
		text-align:left;
		padding:2px;
	}
	.sponsored_url {
		color: #115599;
		overflow:hidden;
		width:190px;
	} 

------------------------
When all is said and done, ads will appear when matches exist. Of course, the behavior also depends on how you set up the ad server, but the above will be a good start. This text didn't format quite the way I expected, so if you want the text document, just ping me.

Let me know if you have any questions!

-Bing