Skip to content

42 school project. Create a webserver in c++ able to handle HTTP/1.1 requests following a specific configuration file.

Notifications You must be signed in to change notification settings

artainmo/webserv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

webserv

Coding a webserver in c++ able to handle HTTP/1.1 requests following a specific configuration file.

Subject: Webserv - 42

Members: 🌜 Artainmo - Ymanzi 🌛

myImage

Resources

myImage

Server setup

HTTP1.1 RFC

HTTP Header Syntax

Select / non-blocking fd / Signals

CGI

myImage

Notes

Understand subject

World Wide Web (www), is a grouping of all documents on the web accessible through URLs (uniform resource locator).

A server is a computer connected to the www, it contains an IP address to identify itself, this IP address can be associated with an URL which is a non-random name for easier identification.

Web browser, is a software application that allows access to the world wide web via URLs. The web browser transforms the URL into an IP address using the DNS (domain name system), it sends http requests towards that ip address or the server machine for the requested content.

Webservers serve an application/platform/website on the web. They receive web browser requests and send back responses. They often run on an external server, which is a computer with no visual display, waiting for client-servers to make requests.

Webserv project consists of writing an http web server. Meaning the web server is able to take and answer http requests.
A real webserver thus at least contains HTML files and an http server. But it can also contain a database to provide dynamic content, meaning the html files take variables from the database to produce dynamic content.

http requests are made by a web browser and consist of different parts, they can request content from the webserver or even post content on the webserver, allowing the client to interact with the webserver. http stands for hypertext transfer protocol, hypertext are documents that include links to other documents.

An http server is able to accept connections with a client, to be able to receive and answer http requests.

A web platform consists of a back-end and front-end.
Back-end acts as a web-server (answer HTTP request) and data-base-manager.
Front-end refers to the pages send by the web-server to the client-server, that the client's web browser will visually display, creating a user interface (UI).
Different programming languages can be used for the back-end and front-end development.

HTTP

The language used between servers to communicate.
http requests are made by a web browser and follow a specific syntax, they can request content from the webserver or even post content on the webserver, allowing the client to interact with the webserver.
http responses are made by the web server and follow a specific syntax, they can send HTML pages for the web browser to display.
http stands for hypertext transfer protocol, hypertext are documents that include links to other documents.

Structure of a request:
First line: method + request targer (URL) + HTTP version (HTTP/1.1)
Header fields: Optional extra information about request
Separation empty space
Body: Content of the request

Structure of a response:
Status line: HTTP version (HTTP/1.1) + status code + status short description
Header fields: Optional extra information about response
Separation empty space
Body: Content of the response

The different HTTP methods for requests are:
GET: requests representation of specified resource, specified resource is indicated by URL.
HEAD: asks for a response identical to that of a GET request, but without the response body, is used to verify if any erros.
POST: used to submit an entity to the specified resource, often causing a change in state on the server.
PUT: replaces all current representations of the specified resource with the submitted request entity.
DELETE: deletes the specified resource.

The different status codes for responses are:
200: OK, successful request
201: Created, for put and post methods
400: Bad request, invalid syntax
404: Not found, the URL is not recognized
405: Not allowed, the resource does not accept the method
500: Internal server error, situation the server does not know how to handle

Example http request:

std::string http_request_example = "GET / HTTP/1.1\r\n"
			"Accept-Charsets: dwdwdq dwqdwqdwqdwq\r\n"
      "Accept-Language: en-us\r\n"
			"Allow: xwdwdw dwqdwqdwq\r\n"
			"Authorization: dwdwqd dwqdwqdwq\r\n"
			"Content-Language: dfeefe efewf\r\n"
			"Content-Length: dwdwdw wdwdqw\r\n"
			"Content-Location: sdwdwqdwdwd\r\n"
			"Content-Type: ddw dwqdwqd\r\n"
			"Date: 09-9w2-1w1\r\n"
      "Host: localhost:8080\r\n"
			"Last-Modified: 424443 3424324\r\n"
			"Location: c;d';e\r\n"
			"Referer: dwwdwqd dwqdqwdwq\r\n"
			"Retry-After: dwdqwdw wdwqdqwdw\r\n"
			"Server: dwdwdwdwdwdwd\r\n"
			"Transfer-Encoding: fdsfsf34343\r\n"
      "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/604.5.6 (KHTML, like Gecko) Version/11.0.3 Safari/604.5.6\r\n"
			"WWW_Authenticate: 2323wwdwdw\r\n"
      "\r\n"
      "Usually GET requests don\'t have a body\r\n"
      "But I don\'t care in this case :)";

New functions allowed in this project

stat - Information about a file
lstat - Information about a symbolic link
fstat - Same as stat but takes fd as argument

mkdir - creates a new directory
opendir - creates a directory stream
readdir - allowd to read the directory stream
closedir - closes the directory stream

gettimeofday - number of seconds and miniseconds since epoch
strptime - transforms time string in numerical structure

Socket programming

Socket programming allows two sockets (IP address with specific port) to communicate. The server contains the listen socket, and client, the client socket.

Create Socket
socket - used to create a socket
setsockopt - helps manipulate the socket for re-use
bind - binds the socket to a specific ip address and port number

Server Socket
listen - listen socket waits for the client socket/sockets to connect
accept - extract first connection request in queue and returns newly connected socket

Client Socket
connect - client socket demands connection with server socket

Listen to multiple clients with one thread using select
select - Allows the webserver to handle multiple clients. It waits until one of fds become active. Also allows for selection between TCP and UDP socket, http server normally uses TCP

Other
send - allows sending a message to another socket
recv - used to reveive a message from a socket
inet_addr - Used to convert an IPv4 address string notation into an int notation
getsockname - returns address of socket
fcntl - funcion able to perform a range of different commands on file descriptors, here can only be used to set a fd as non-blocking

POST sets received body as env var, that can be used by the URI that is called with cgi or not to produce an upload file. The upload file should be returned in body.
*Set env var
*Call upload file with cgi (or not)
*Produced upload file should be returned in body

Parsing body

If Transfer-encoding header is set to chunked, the content will be sent in multiple requests.
If header-lenght is specified, the content will be send in one request and the lenght of it is specified.

CGI

When sending content to a browser it has to be in HTML.
If file is not written in HTML it can be transformed to HTML using CGI, an interface that enables to execute an external program that transfomrs the non-HTML file to HTML.

Server host name and port

Port 1024 and above are not privileged and free to use and will not fail at binding. localhost can bind to ports under 1024.
Not all host addresses can be assigned, locally only network interfaces can be used. All available network addresses can be found using ifconfig -a, to parse the useful information use : ifconfig -a | grep "inet ".

Steps

-Take file path as argument
-Parse the file
-Socket connection / routing
-Parse client socket messages
--method
--URI / URL
--protocol version (IPv4 | IPv6)
--Header fields
---Request modifiers
---client info and representation
---metadata
--empty line to indicate end of header
--Message body containing payload body
-http answers
--status line protocol version
--success or error code
--textual reason phrase
--header fields
---server information
---resource metadata
---representational metadata
---Empty line to indicate end of the header
---Message body that contains the payload body

RFC
-Message Syntax And Routing
-Semantics And Content
-Conditional Requests
-Range Requests
-Caching
-Authentification

VERIF
-Check leaks, hangs, select, .....

Links

How to make a basic Tcp server Video
How to make a basic http server video
HTTP request parsing https

NGINX workings

Git collaboration

Never work on same files
Work on separate branches:
git branch <branch name> -> create branch
git checkout <branch name> -> go on branch
git push -u origin <branch name> -> push on your branch
git merge <branch name> -> from main branch, merge a sub-branch
git branch -d <branch name> -> delete your branch

About

42 school project. Create a webserver in c++ able to handle HTTP/1.1 requests following a specific configuration file.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published