mkapalka/java-http-server

Simple Java HTTP 1.1 server

Simple HTTP 1.1 server for static content. It exposes files (text, HTML, images, etc.) in a given directory (or multiple directories) as HTTP resources that can be downloaded via GET requests. It supports HEAD requests and connection keep-alive. The Java code provides a generic framework that can be extended with further functionality.

The server can be built and run as follows:

mvn verify
./run-server . localhost 3333

This starts the server on port 3333, bound to localhost, serving files from the current directory. Example usage:

curl -v http://localhost:3333/README.md

Security note: the server prevents path traversal outside the specified base directory. However, symbolic links under the base directory are followed even if the link target is outside.
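A purely lexical traversal check of the kind described in the security note could look like the following minimal sketch. The class name `SafePathResolver` and its method are hypothetical and are not taken from the repository; the actual implementation may differ. Because the check only normalizes the path and never touches the file system, symbolic links are not resolved, which matches the caveat above. A stricter variant could compare `Path.toRealPath()` results instead.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper: maps a request path to a file under a base directory. */
final class SafePathResolver {
    private final Path baseDir;

    SafePathResolver(Path baseDir) {
        // Normalize once so that startsWith() compares clean absolute paths.
        this.baseDir = baseDir.toAbsolutePath().normalize();
    }

    /** Returns the resolved file, or empty if the path is invalid or escapes baseDir. */
    Optional<Path> resolve(String requestPath) {
        // Strip leading slashes so the request path is resolved relative to baseDir.
        String relative = requestPath.replaceFirst("^/+", "");
        try {
            // normalize() collapses "." and ".." segments without touching the disk.
            Path candidate = baseDir.resolve(relative).normalize();
            return candidate.startsWith(baseDir) ? Optional.of(candidate) : Optional.empty();
        } catch (InvalidPathException e) {
            return Optional.empty(); // e.g. a NUL character in the request path
        }
    }
}
```

With a base directory of `/srv/www`, a request for `/index.html` resolves to `/srv/www/index.html`, while `/../etc/passwd` normalizes to `/srv/etc/passwd` and is rejected.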

Design and implementation choices

Following the exercise description, the server is implemented using a thread pool, with each request being processed synchronously by a single thread. This is a simple and classical design with limitations that are discussed later in this document. Currently, the thread pool has a fixed number of threads, but a different pooling policy could easily be added.
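The thread-pool design can be illustrated with the sketch below. It is not the repository's `HttpServer` class: the name `PooledServerSketch`, the placeholder 204 response, and the backlog value are assumptions made for the example. It only shows the overall shape: one accept loop, and a fixed pool in which each connection is handled synchronously by a single thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical sketch of a fixed-thread-pool server; not the repository's code. */
final class PooledServerSketch implements AutoCloseable {
    private final ServerSocket serverSocket;
    private final ExecutorService pool;

    PooledServerSketch(int port, int threads) throws IOException {
        // Bind to the loopback address; port 0 asks the OS for a free port.
        serverSocket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        pool = Executors.newFixedThreadPool(threads);
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    /** Accept loop: hands each connection to the pool until the socket is closed. */
    void serve() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();
                pool.execute(() -> handle(client));
            } catch (IOException | RejectedExecutionException e) {
                if (serverSocket.isClosed()) {
                    break; // normal shutdown path
                }
            }
        }
    }

    /** Runs on a pool thread and processes one connection synchronously. */
    private void handle(Socket client) {
        try (client) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // Skip the request head; a real server would parse it here.
            }
            OutputStream out = client.getOutputStream();
            out.write("HTTP/1.1 204 No Content\r\nConnection: close\r\n\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
            out.flush();
        } catch (IOException e) {
            // The connection is dropped; try-with-resources closes the socket.
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close(); // unblocks accept() in serve()
        pool.shutdown();
    }
}
```

Swapping `Executors.newFixedThreadPool` for another `ExecutorService` factory is the only change needed to try a different pooling policy in this sketch.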

To avoid blocking the threads in the pool for too long, the server enforces strict per-connection limits: it bounds the time to receive each request (the first request and every subsequent request on the same connection), the time to send the response back to the client, and the number of requests that can be sent on a single keep-alive connection.
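As an illustration of the receive-side limits, the sketch below combines a per-request deadline with a cap on the number of requests per connection. The names `RequestDeadline` and `KeepAliveLoopSketch` are hypothetical, and the repository may implement its limits differently. The sketch relies on `Socket.setSoTimeout`, which bounds a single blocking read rather than the whole request, so the remaining budget is recomputed before every read. Limiting the time to send a response is not covered here.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.time.Duration;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: tracks how much time is left to receive one request. */
final class RequestDeadline {
    private final long deadlineNanos;

    RequestDeadline(Duration limit) {
        deadlineNanos = System.nanoTime() + limit.toNanos();
    }

    /** Remaining budget in milliseconds; never 0, which setSoTimeout treats as "no timeout". */
    int remainingMillis() throws SocketTimeoutException {
        long remaining = TimeUnit.NANOSECONDS.toMillis(deadlineNanos - System.nanoTime());
        if (remaining <= 0) {
            throw new SocketTimeoutException("request deadline exceeded");
        }
        return (int) Math.min(remaining, Integer.MAX_VALUE);
    }
}

/** Hypothetical keep-alive loop with a request cap and a per-request deadline. */
final class KeepAliveLoopSketch {
    /** Returns the number of request heads received before the limit, EOF, or a timeout. */
    static int serve(Socket socket, int maxRequests, Duration perRequest) throws IOException {
        InputStream in = new BufferedInputStream(socket.getInputStream());
        int served = 0;
        while (served < maxRequests) {
            RequestDeadline deadline = new RequestDeadline(perRequest);
            if (!skipRequestHead(socket, in, deadline)) {
                break; // client closed the connection
            }
            served++;
            // A real server would parse the request and write the response here.
        }
        return served;
    }

    /** Consumes bytes up to and including the blank line (CRLF CRLF) that ends the head. */
    private static boolean skipRequestHead(Socket socket, InputStream in, RequestDeadline deadline)
            throws IOException {
        final byte[] terminator = {'\r', '\n', '\r', '\n'};
        int matched = 0;
        while (matched < terminator.length) {
            // Shrink the read timeout so the total time stays within the deadline.
            socket.setSoTimeout(deadline.remainingMillis());
            int b = in.read();
            if (b < 0) {
                return false;
            }
            matched = (b == terminator[matched]) ? matched + 1 : (b == '\r' ? 1 : 0);
        }
        return true;
    }
}
```

A `SocketTimeoutException` propagates to the caller, which can then close the connection and free the pool thread.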

The server code was written from scratch: no code was copied from other projects or generated by AI. The main code does not use any external libraries or frameworks, which means that it contains some boilerplate code that would typically be provided by common open-source libraries or generated, e.g., by Lombok (some of this trivial code was generated by the IDE). Also, the Java Logging API is used for logging; Logback or Log4j would be better but require an external dependency.

The test code uses standard testing libraries: JUnit, AssertJ and Mockito. Unit and integration tests are included in the repository. Additional manual testing was done using curl and netcat, in particular to test that the various timeout mechanisms work correctly. (It is possible to cover those test cases with automated integration tests; however, this would require more time than what was available.)

No load or performance testing was performed. This would be a required step for production code.

Code structure

Class HttpServer contains the top-level code of the server. An example of how this class can be configured and used can be found in class Main.

Sub-packages of package eu.kapalka.http are divided according to the HTTP request lifecycle:

  • request: implements HTTP request parsing and validation,
  • handler: handles HTTP requests by providing the corresponding HTTP response,
  • repository: implements a file-based repository for static content, and
  • response: formats HTTP responses that are to be sent back to the client.

Unit and integration tests are together in src/test/java.

The internal error handling follows these guiding principles:

  • Exceptions of type IOException are propagated all the way to the top-level server code whenever the best way to handle them is to immediately close the client connection.
  • For errors that require special handling and that are reported by public methods, I followed a more functional approach: see classes RequestParser and StaticFileRepository. (Method readToken of LineReader is an exception: we follow here the standard convention for Java I/O classes and return null on end-of-line.)
  • For error handling within a class (private methods), I used whatever was the most convenient (but there is always room for improvement).
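One common reading of a "functional approach" to error reporting is to return a result value that is either a success or a failure, instead of throwing. The sketch below illustrates that idea only; `ParseResult`, `MethodParser`, and the chosen status codes are hypothetical, and the actual API of RequestParser and StaticFileRepository may look quite different. It assumes Java 17 or later for sealed interfaces and records.

```java
import java.util.function.Function;

/** Hypothetical result type: either a parsed value or a failure with an HTTP status. */
sealed interface ParseResult<T> {
    record Ok<T>(T value) implements ParseResult<T> {}

    record Failure<T>(int statusCode, String reason) implements ParseResult<T> {}

    /** Forces the caller to handle both outcomes explicitly. */
    default <R> R fold(Function<T, R> onOk, Function<Failure<T>, R> onFailure) {
        if (this instanceof Ok<T> ok) {
            return onOk.apply(ok.value());
        }
        return onFailure.apply((Failure<T>) this);
    }
}

/** Hypothetical parser for the method token of an HTTP request line. */
final class MethodParser {
    static ParseResult<String> parseMethod(String requestLine) {
        int space = requestLine.indexOf(' ');
        if (space <= 0) {
            return new ParseResult.Failure<>(400, "malformed request line");
        }
        String method = requestLine.substring(0, space);
        if (!method.equals("GET") && !method.equals("HEAD")) {
            return new ParseResult.Failure<>(501, "unsupported method: " + method);
        }
        return new ParseResult.Ok<>(method);
    }
}
```

A caller can then turn either outcome into a response in one expression, for example `MethodParser.parseMethod(line).fold(m -> 200, f -> f.statusCode())`.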

Current limitations and improvement ideas

There are countless features that could be added to this HTTP server, e.g., a handler for update requests that would enable using this server for a simple REST API. Such a handler could also be file-based: use files to implement "blob storage" where HTTP GET, PUT and DELETE methods translate almost directly to the corresponding file system operations. Care would be needed to provide the necessary security (e.g., by using UUIDs or hashes of request URIs as file names instead of the URI paths), atomicity (e.g., by using atomic file rename for PUT operations) and durability (e.g., by careful usage of fsync).
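The file-based "blob storage" idea could be sketched roughly as follows. Nothing here exists in the repository: `BlobStoreSketch` and its methods are hypothetical. File names are SHA-256 hashes of the request path, so the path never reaches the file system; PUT writes to a temporary file, forces it to storage, and then renames it atomically. The sketch assumes Java 17 or later for `HexFormat`. Note that `ATOMIC_MOVE` leaves it implementation-specific whether an existing target is replaced, and that full durability would typically also require syncing the parent directory, which is omitted.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Optional;

/** Hypothetical file-backed blob store keyed by request URI path. */
final class BlobStoreSketch {
    private final Path root;

    BlobStoreSketch(Path root) {
        this.root = root;
    }

    /** Maps a URI path to a file named after its SHA-256 hash, always directly under root. */
    Path fileFor(String uriPath) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(uriPath.getBytes(StandardCharsets.UTF_8));
            return root.resolve(HexFormat.of().formatHex(digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every Java platform
        }
    }

    /** PUT: write to a temp file, force it to storage, then rename it into place. */
    void put(String uriPath, byte[] body) throws IOException {
        Path tmp = Files.createTempFile(root, "put-", ".tmp");
        try {
            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.wrap(body);
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                channel.force(true); // fsync-like: flush data and metadata
            }
            // Readers see either the old blob or the complete new one.
            Files.move(tmp, fileFor(uriPath), StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    /** GET: the blob's bytes, or empty if no blob is stored for this path. */
    Optional<byte[]> get(String uriPath) throws IOException {
        try {
            return Optional.of(Files.readAllBytes(fileFor(uriPath)));
        } catch (NoSuchFileException e) {
            return Optional.empty();
        }
    }

    /** DELETE: returns true if a blob existed and was removed. */
    boolean delete(String uriPath) throws IOException {
        return Files.deleteIfExists(fileFor(uriPath));
    }
}
```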

In terms of compliance with HTTP 1.1 and related web standards, some important features are not implemented, in particular:

  • chunked transfer encoding,
  • HTTP method OPTIONS,
  • caching, and
  • authentication.

The design in which each request is processed entirely (and synchronously) by a thread from the pool has a known drawback: the pool size caps concurrency (the number of requests that can be handled in parallel) and therefore the overall throughput. It also makes it easier for an attacker to mount a denial-of-service attack by opening a large number of concurrent requests and thereby blocking all threads in the pool. The easiest way to overcome this limitation is to use virtual threads: simply spawn a new virtual thread for each request (with a mechanism to limit the number of active virtual threads, e.g., a semaphore). A big advantage of this solution is that it preserves the simplicity and readability of the current code. It should also perform well with a large number of concurrent clients. A potential limitation comes from the fact (which may no longer hold in the newest Java version) that virtual threads are not unmounted from their carrier threads during blocking file I/O operations. This could limit the scalability of the server, given that it streams data from the file system to serve requests.
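The virtual-thread variant with a semaphore could be sketched as follows, assuming Java 21 or later. `BoundedVirtualExecutor` is a hypothetical name and is not part of the repository. The caller (for example, an accept loop) blocks in `submit` until a permit is free, so at most `maxConcurrent` tasks run at any time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: one virtual thread per task, bounded by a semaphore. */
final class BoundedVirtualExecutor implements AutoCloseable {
    private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
    private final Semaphore permits;

    BoundedVirtualExecutor(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    /** Blocks until a permit is available, then runs the task on a new virtual thread. */
    void submit(Runnable task) throws InterruptedException {
        permits.acquire();
        try {
            executor.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot even if the task fails
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // the task never started, so return its permit
            throw e;
        }
    }

    /** Waits for submitted tasks to finish, then shuts the executor down. */
    @Override
    public void close() {
        executor.close();
    }
}
```

In a server, the accept loop would call `submit(() -> handle(client))` in place of the fixed pool's `execute`, leaving the request-handling code unchanged.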

Another option to improve the throughput of the server would be to use asynchronous I/O operations, which would allow handling multiple concurrent requests in the same thread. The cost of this solution is greater code complexity. Whether it provides a significant performance advantage over the solution with virtual threads would need to be evaluated experimentally.
