Building REST API for Legacy PHP Projects
Every once in a while, PHP developers are charged with extending the functionality of legacy projects, a task that often includes building REST APIs. Building a REST API for a PHP-based project is challenging in itself, and in the absence of proper frameworks and tools it can be particularly difficult to get right. In this article, Toptal engineer Arminas Zukauskas shares his advice, with sample code, on how to build a modern, structured REST API around existing legacy PHP projects.
Arminas uses his IT skills to make businesses more efficient. He has worked on Toptal Core, and his expertise includes NodeJS, PHP, MySQL, and analytical thinking.
Building or architecting a REST API is not an easy task, especially when you have to do it for legacy PHP projects. There are a lot of third-party libraries nowadays that make it easy to implement a REST API, but integrating them into existing legacy codebases can be rather daunting. And you don’t always have the luxury of working with modern frameworks, such as Laravel and Symfony. With legacy PHP projects, you can often find yourself somewhere in the middle of deprecated in-house frameworks, running on top of old versions of PHP.
In this article, we will take a look at some common challenges of trying to implement REST APIs from scratch, a few ways to work around those issues and an overall strategy for building custom PHP based API servers for legacy projects.
Although the article is based on PHP 5.3 and above, the core concepts are valid for all versions of PHP beyond version 5.0, and can even be applied to non-PHP projects. Here, we will not cover what a REST API is in general, so if you’re not familiar with it be sure to read about it first.
To make it easy for you to follow along, here is a list of some terms used throughout this article and their meanings:
- API server: main REST application serving the API, in this case, written in PHP.
- API endpoint: a backend “method” which the client communicates with to perform an action and produce results.
- API endpoint URL: URL through which the backend system is accessible to the world.
- API token: a unique identifier passed via HTTP headers or cookies from which the user can be identified.
- App: client application which will communicate with the REST application via API endpoints. In this article we will assume that it is web based (either desktop or mobile), and so it is written in JavaScript.
Initial Steps
Path Patterns
One of the very first things that we need to decide is at what URL path the API endpoints will be available. There are two popular options:
- Create a new subdomain, such as api.example.com.
- Create a path, such as example.com/api.
At a glance, it may seem that the first variant is more popular and attractive. In reality, however, if you’re building a project-specific API, it could be more appropriate to choose the second variant.
One of the most important reasons behind taking the second approach is that it allows cookies to be used as a means to transfer credentials. Browser-based clients will automatically send the appropriate cookies with XHR requests, eliminating the need for an additional authorization header.
Another important reason is that you avoid subdomain configuration and management, as well as problems where custom headers are stripped by some proxy servers. Both can be a tedious ordeal in legacy projects.
Using cookies can be considered an “unRESTful” practice as REST requests should be stateless. In this case we can make a compromise and pass the token value in a cookie instead of passing it via a custom header. Effectively we are using cookies as just a way to pass the token value instead of the session_id directly. This approach could be considered stateless, but we can leave it up to your preferences.
API endpoint URLs can also be versioned. Additionally, they can include the expected response format as an extension in the path name. Although these details are not critical, especially during the early stages of API development, they can certainly pay off in the long run, especially when you need to implement new features. Checking which version the client expects and providing the needed format for backwards compatibility can be the best solution.
The API endpoint URL structure could look as follows:
example.com/api/${version_code}/${actual_request_path}.${format}
And, a real example:
example.com/api/v1.0/records.json
Routing
After choosing a base URL for the API endpoints, the next thing we need to do is to think about our routing system. It could be integrated into an existing framework, but if that is too cumbersome, a potential workaround is to create a folder named “api” in the document root. That way the API can have completely separate logic. You can extend this approach by placing the API logic into its own files: for example, “www/api/index.php” as the entry point, “www/api/Api.php” for the main API class, and one file per API endpoint under “www/api/Apis/”.
You can think of “www/api/Apis/Users.php” as a separate “controller” for a particular API endpoint. It would be great to reuse implementations from the existing codebase, for example reuse models that are already implemented in the project to communicate with the database.
Finally, make sure to point all incoming requests from “/api/*” to “/api/index.php”. This can be done by changing your web server configuration.
API Class
Version and Format
You should always clearly define which versions and formats your API endpoints accept and which ones are the defaults. This will allow you to build new features in the future while maintaining old functionality. The API version can basically be any string, but numeric values are easier to understand and compare. It is good to have spare digits for minor versions, because they clearly indicate that only a few things are different:
- v1.0 would mean first version.
- v1.1 first version with some minor changes.
- v2.0 would be a completely new version.
The format can be anything your client needs, including but not limited to JSON, XML, and even CSV. Providing it in the URL as a file extension keeps the API endpoint URL readable, and it becomes a no-brainer for the API consumer to know what format they can expect:
- “/api/v1.0/records.json” would return JSON array of records
- “/api/v1.0/records.xml” would return XML file of records
It’s worth pointing out that you will also need to send a proper Content-Type header in the response for each of these formats.
Upon receiving an incoming request, one of the first things you should do is check whether the API server supports the requested version and format. In your main method, which handles the incoming request, parse $_SERVER['PATH_INFO'] or $_SERVER['REQUEST_URI'] to determine if the requested format and version are supported. Then, either continue or return a 4xx response (e.g. 406 “Not Acceptable”). The most critical part here is to always return something that the client expects. An alternative would be to check the “Accept” request header instead of the URL path extension.
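The version and format check described above can be sketched as follows. This is only an illustration: the function name parse_api_path and the lists of supported versions and formats are assumptions, and it expects a path in the shape that $_SERVER['PATH_INFO'] would have once "/api" has been stripped.

```php
<?php
// Hypothetical helper: split "/v1.0/records/7.json" into version, path and format.
// Returns null when the shape, version or format is not supported.
function parse_api_path($path_info, array $versions, array $formats)
{
    // Expected shape: /{version}/{request path}.{format}
    if (!preg_match('#^/(v[0-9]+\.[0-9]+)/(.+)\.([a-z0-9]+)$#i', $path_info, $m)) {
        return null;
    }
    $version = strtolower($m[1]);
    $format  = strtolower($m[3]);
    if (!in_array($version, $versions, true) || !in_array($format, $formats, true)) {
        return null;
    }
    return array('version' => $version, 'path' => $m[2], 'format' => $format);
}

// Example values only: which versions and formats exist is up to the project
$request = parse_api_path('/v1.0/records/7.json', array('v1.0', 'v1.1'), array('json', 'xml'));
// $request is array('version' => 'v1.0', 'path' => 'records/7', 'format' => 'json')
```

When the helper returns null, the main method would answer with a 4xx response instead of continuing.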
Allowed Routes
You could forward everything transparently to your API controllers, but it might be better to use a whitelisted set of allowed routes. This reduces flexibility a bit, but it provides very clear insight into what the API endpoint URLs look like the next time you return to the code.
private $public_routes = array(
    'system' => array(
        'regex' => 'system',
    ),
    'records' => array(
        'regex' => 'records(?:/?([0-9]+)?)',
    ),
);
You could also move these to separate files to make things cleaner. The configuration above will be used to enable requests to these URLs:
/api/v1.0/system.json
/api/v1.0/records.json
/api/v1.0/records/7.json
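One possible way to check a request path against such a whitelist is sketched below. The helper name match_route is hypothetical, and the path is assumed to already have the version prefix and format extension removed (for example "records/7").

```php
<?php
// Same shape as the $public_routes whitelist above
$public_routes = array(
    'system'  => array('regex' => 'system'),
    'records' => array('regex' => 'records(?:/?([0-9]+)?)'),
);

// Hypothetical helper: returns array(route name, captured parameters),
// or null when the path is not whitelisted.
function match_route($path, array $routes)
{
    foreach ($routes as $name => $route) {
        // Anchor the pattern so that only complete paths match.
        // "#" is used as the delimiter because the route patterns contain "/".
        if (preg_match('#^' . $route['regex'] . '$#', $path, $m)) {
            array_shift($m); // drop the full match, keep the captured groups
            return array($name, $m);
        }
    }
    return null;
}

$match = match_route('records/7', $public_routes);
// $match is array('records', array('7'))
```

Anything that returns null here never reaches an API controller, which is the point of the whitelist.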
Handling PUT Data
PHP automatically handles incoming POST data and places it in the $_POST superglobal. However, that is not the case with PUT requests: all the data is “buried” in php://input. Do not forget to parse it into a separate structure or array before invoking the actual API method. A simple parse_str could be enough, but if the client is sending a multipart request, additional parsing may be needed to handle form boundaries. A typical use case for multipart requests is file uploads. Detecting and handling multipart requests can be done as follows:
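self::$input = file_get_contents('php://input');
// For PUT/DELETE there is input data instead of request variables
if (!empty(self::$input)) {
    // CONTENT_TYPE may be missing, so fall back to an empty string
    $content_type = isset($_SERVER['CONTENT_TYPE']) ? $_SERVER['CONTENT_TYPE'] : '';
    preg_match('/boundary=(.*)$/', $content_type, $matches);
    if (isset($matches[1]) && strpos(self::$input, $matches[1]) !== false) {
        // multipart request: split the body on the form boundary
        $this->parse_raw_request(self::$input, self::$input_data);
    } else {
        // urlencoded body: key=value&key2=value2
        parse_str(self::$input, self::$input_data);
    }
}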
Here, parse_raw_request could be implemented as:
/**
 * Helper method to parse raw multipart requests into $a_data
 */
private function parse_raw_request($input, &$a_data)
{
    // grab multipart boundary from content type header
    preg_match('/boundary=(.*)$/', $_SERVER['CONTENT_TYPE'], $matches);
    $boundary = $matches[1];
    // split content by boundary and get rid of last -- element
    // (the boundary is escaped because it may contain regex metacharacters)
    $a_blocks = preg_split('/-+' . preg_quote($boundary, '/') . '/', $input);
    array_pop($a_blocks);
    // loop data blocks
    foreach ($a_blocks as $id => $block) {
        if (empty($block)) {
            continue;
        }
        // parse uploaded files
        if (strpos($block, 'application/octet-stream') !== false) {
            // match "name", then everything after "stream" (optional) except for prepending newlines
            preg_match("/name=\"([^\"]*)\".*stream[\n\r]+([^\n\r].*)?$/s", $block, $matches);
        // parse all other fields
        } else {
            // match "name" and optional value in between newline sequences
            preg_match('/name=\"([^\"]*)\"[\n\r]+([^\n\r].*)?\r$/s', $block, $matches);
        }
        // skip blocks that did not match; a field without a value becomes an empty string
        if (isset($matches[1])) {
            $a_data[$matches[1]] = isset($matches[2]) ? $matches[2] : '';
        }
    }
}
With this, we can have the necessary request payload at Api::$input as raw input and Api::$input_data as an associative array.
Faking PUT/DELETE
Sometimes you can find yourself in a situation where the server does not support anything besides the standard GET/POST HTTP methods. A common solution to this problem is to “fake” PUT/DELETE or any other custom request method. For that you can use a “magic” parameter, such as “_method”. If you see it in your $_REQUEST array, simply assume that the request is of the specified type. Modern frameworks like Laravel have such functionality built into them. It provides great compatibility in case your server or client has limitations (for example, a user on an office Wi-Fi network behind a corporate proxy that does not allow PUT requests).
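A minimal sketch of the “_method” trick follows. The function name resolve_method is hypothetical, and limiting the override to POST requests and to PUT/DELETE is an assumption made here for safety, not something the approach strictly requires.

```php
<?php
// Hypothetical helper: work out the effective HTTP method, honouring a "_method" override.
function resolve_method(array $server, array $request)
{
    $method = isset($server['REQUEST_METHOD']) ? strtoupper($server['REQUEST_METHOD']) : 'GET';
    // Assumption: only POST may be overridden, so a plain link (GET) cannot trigger a DELETE
    if ($method === 'POST' && isset($request['_method']) && is_string($request['_method'])) {
        $override = strtoupper($request['_method']);
        if (in_array($override, array('PUT', 'DELETE'), true)) {
            return $override;
        }
    }
    return $method;
}

// In the API class this could be called as resolve_method($_SERVER, $_REQUEST)
$effective = resolve_method(array('REQUEST_METHOD' => 'POST'), array('_method' => 'put'));
// $effective is 'PUT'
```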
Forwarding to Specific API
If you don’t have the luxury of reusing existing project autoloaders, you can create your own with the help of the spl_autoload_register function. Define it in your “api/index.php” page and call your API class located in “api/Api.php”. The API class acts as middleware and calls the actual method. For example, a request to “/api/v1.0/records/7.json” should end up invoking the GET method in “Apis/Records.php” with the parameter 7. This ensures separation of concerns and keeps the logic cleaner. Of course, if it is possible to integrate this deeper into the framework you are using and reuse its specific controllers or routes, you should consider that possibility too.
Example “api/index.php” with primitive autoloader:
<?php
// Let's define very primitive autoloader
spl_autoload_register(function($classname){
    $classname = str_replace('Api_', 'Apis/', $classname);
    if (file_exists(__DIR__.'/'.$classname.'.php')) {
        require __DIR__.'/'.$classname.'.php';
    }
});
// Our main method to handle request
Api::serve();
This will load our Api class and start serving it independently of the main project.
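What happens inside the API class after that is project specific. As a rough sketch of the forwarding step, assuming the route name, HTTP method, and parameters have already been determined, a hypothetical dispatch function could look like this. The Api_Records class below is a stand-in so that the example runs on its own.

```php
<?php
// Stand-in endpoint class; the real one would live in Apis/Records.php
class Api_Records
{
    public function get($id = null)
    {
        return $id ? 'record ' . intval($id) : 'all records';
    }
}

// Hypothetical dispatcher: maps a route name, HTTP method and parameters to an endpoint method.
// Returns null when nothing can handle the request.
function dispatch($route, $http_method, array $params)
{
    $class  = 'Api_' . ucfirst($route);  // "records" becomes "Api_Records"
    $method = strtolower($http_method);  // "GET" becomes "get"
    // Only plain HTTP verbs are allowed to be called as methods
    if (!in_array($method, array('get', 'post', 'put', 'delete'), true)) {
        return null;
    }
    // class_exists() triggers any autoloader registered with spl_autoload_register
    if (!class_exists($class)) {
        return null;
    }
    $endpoint = new $class();
    if (!is_callable(array($endpoint, $method))) {
        return null;
    }
    return call_user_func_array(array($endpoint, $method), $params);
}

$result = dispatch('records', 'GET', array('7'));
// $result is 'record 7'
```

A null result would typically be turned into a 404 or 405 response by the caller.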
OPTIONS Requests
When a client uses a custom header to forward its unique token, the browser first needs to check whether the server supports that header. That’s where the OPTIONS request comes in. Its purpose is to ensure that everything is alright and safe for both the client and the API server. So an OPTIONS request could be fired every time a client tries to do anything. However, when a client is using cookies for credentials, the browser does not have to send this additional OPTIONS request.
If a client makes a POST request to /users/8.json with cookies, the request will be pretty standard:
- App performs a POST request to /users/8.json.
- The browser carries out the request and receives a response.
But with custom authorization or token header:
- App performs a POST request to /users/8.json.
- The browser stops processing the request and initiates an OPTIONS request instead.
- OPTIONS request is sent to /users/8.json.
- The browser receives response with a list of all available methods and headers, as defined by the API.
- Browser continues with the original POST request only if the custom header is present in the list of available headers.
However, keep in mind that even when using cookies, with PUT/DELETE you might still receive that additional OPTIONS request. So be prepared to respond to it.
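Responding to such an OPTIONS request early, before any routing happens, could be sketched as follows. The helper name preflight_response is hypothetical, and the header values are examples that should match whatever your API actually allows.

```php
<?php
// Hypothetical helper: header lines to send back for a preflight (OPTIONS) request,
// or null when the request is not a preflight.
function preflight_response($request_method)
{
    if (strtoupper($request_method) !== 'OPTIONS') {
        return null; // not a preflight, carry on with normal routing
    }
    return array(
        'HTTP/1.1 204 No Content',
        'Access-Control-Allow-Origin: *',
        'Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, DELETE',
        'Access-Control-Allow-Headers: origin, content-type, accept, x-authorization',
        'Access-Control-Max-Age: 86400',
    );
}

$method    = isset($_SERVER['REQUEST_METHOD']) ? $_SERVER['REQUEST_METHOD'] : 'GET';
$preflight = preflight_response($method);
if ($preflight !== null) {
    foreach ($preflight as $line) {
        header($line);
    }
    exit; // a preflight response needs no body
}
```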
Records API
Basic Structure
Our example Records API is pretty straightforward. It will contain all the request methods and return output back to the same main API class. For example:
<?php
class Api_Records
{
    public function __construct()
    {
        // In here you could initialize some shared logic between this API and rest of the project
    }

    /**
     * Get individual record or records list
     */
    public function get($id = null)
    {
        if ($id) {
            return $this->getRecord(intval($id));
        } else {
            return $this->getRecords();
        }
    }

    /**
     * Update record
     */
    public function put($record_id = null)
    {
        // In real world there would be call to model with validation and probably token checking
        // Use Api::$input_data to update
        return Api::responseOk('OK', array());
    }
    // ...
Defining one method per HTTP verb in this way makes it easier to build the API in a REST style.
Formatting Output
Naively responding with everything received from the database back to the client can have catastrophic consequences. To avoid any accidental exposure of data, create a dedicated format method that returns only whitelisted keys.
Another benefit of whitelisted keys is that you can write documentation based on them and do all the type checking in one place, ensuring, for example, that user_id will always be an integer, that the is_banned flag will always be a boolean true or false, and that date-times will have one standard response format.
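As an illustration, a format method for a user row might look like the sketch below. The function name format_user, the created_at and password_hash fields, and the choice of a UTC ISO 8601 date format are all assumptions made for this example; only user_id and is_banned come from the text above.

```php
<?php
// Hypothetical formatter: returns only whitelisted keys, each cast to a documented type.
function format_user(array $row)
{
    return array(
        'user_id'    => (int) $row['user_id'],
        'is_banned'  => (bool) $row['is_banned'],
        // Assumption: the database stores UTC date-times; output one standard format
        'created_at' => gmdate('Y-m-d\TH:i:s\Z', strtotime($row['created_at'] . ' UTC')),
    );
}

// A database row usually arrives as strings and may contain sensitive columns
$row = array(
    'user_id'       => '42',
    'is_banned'     => '0',
    'created_at'    => '2015-08-17 10:00:00',
    'password_hash' => 'secret',
);
$user = format_user($row);
// 'password_hash' is not in $user because it was never whitelisted
```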
Outputting Results
Headers
A separate method for outputting headers helps ensure that everything sent to the browser is correct. This method can take advantage of the API being accessible via the same domain while still keeping the option of receiving a custom authorization header. The choice between same-domain and third-party-domain access can be made with the help of the HTTP_ORIGIN and HTTP_REFERER server variables. If the API server detects that the client is using x-authorization (or any other custom header), it should allow access from all origins and allow the custom header. It could look like this:
header('Access-Control-Allow-Origin: *');
header('Access-Control-Expose-Headers: x-authorization');
header('Access-Control-Allow-Headers: origin, content-type, accept, x-authorization');
header('X-Authorization: '.YOUR_TOKEN_HERE);
However, if the client is using cookie-based credentials, the headers could be a bit different, allowing only the requesting host and cookie-related headers for credentials:
header('Access-Control-Allow-Origin: '.$origin);
header('Access-Control-Expose-Headers: set-cookie, cookie');
header('Access-Control-Allow-Headers: origin, content-type, accept, set-cookie, cookie');
// Allow cookie credentials because we're on the same domain
header('Access-Control-Allow-Credentials: true');
if (strtolower($_SERVER['REQUEST_METHOD']) != 'options') {
    setcookie(TOKEN_COOKIE_NAME, YOUR_TOKEN_HERE, time()+86400*30, '/', '.'.$_SERVER['HTTP_HOST']);
}
Keep in mind that the browser does not send cookies with an OPTIONS request. Finally, the following headers list the HTTP methods we want to allow and let the browser cache the access control result for one day:
header('Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, DELETE');
header('Access-Control-Max-Age: 86400');
Body
The body itself should contain the response in the format requested by your client, with a 2xx HTTP status upon success, a 4xx status upon failure due to the client, and a 5xx status upon failure due to the server. The structure of the response can vary, although specifying “status” and “response” fields could be beneficial too. For example, if the client is trying to register a new user and the username is already taken, you could send a response with HTTP status 200 but a JSON body that looks something like:
{"status": "ERROR", "response": "username already taken"}
… instead of HTTP 4xx error directly.
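Building such a body can be kept in one small helper. The sketch below is only one way to do it: build_response is a hypothetical name, and it handles JSON only, so other formats would need their own encoders and Content-Type values.

```php
<?php
// Hypothetical helper: returns array(HTTP status code, Content-Type header line, body).
function build_response($status, $response, $http_code = 200)
{
    $body = json_encode(array('status' => $status, 'response' => $response));
    return array($http_code, 'Content-Type: application/json; charset=utf-8', $body);
}

list($code, $content_type, $body) = build_response('ERROR', 'username already taken');
// $body is {"status":"ERROR","response":"username already taken"}
```

The caller would then send the status code and header with header() and echo the body.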
Conclusion
No two projects are exactly the same. The strategy outlined in this article may or may not be a good fit for your case, but the core concepts should be similar nonetheless. It’s worth noting that not every page can have the latest trending or up-to-date framework behind it, and sometimes the anger regarding “why my REST Symfony bundle doesn’t work here” can be turned into motivation for building something useful, something that works. The end result may not be as shiny, as it will always be a custom, project-specific implementation, but at the end of the day the solution will be something that really works, and in a scenario like this that should be the goal of every API developer.
Example implementations of the concepts discussed here have been uploaded to a GitHub repository for convenience. You may not want to use this sample code directly in production as it is, but it could easily work as a starting point for your next legacy PHP API integration project.
Had to implement a REST API server for some legacy project recently? Share your experience with us in the comment section below.
Kaunas, Kaunas County, Lithuania
Member since August 17, 2015