-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathtutorial.tex
1432 lines (1198 loc) · 62.2 KB
/
tutorial.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\documentclass[11pt,oneside]{article}
%\documentclass[oneside,10pt]{book}
\title{ {\huge \sc Linear programming tutorial} }
\author{Ivan Savov}
\usepackage{amsthm,amsmath,amssymb,amsfonts,latexsym}
\usepackage{graphicx}
\usepackage{hyperref}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% STEP 1: Choose the true/false values for DRAFTMODE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{ifthen}
\newboolean{DRAFTMODE} % if PROOFREADING=true:
\setboolean{DRAFTMODE}{true} % 10pt, dblspaced, 8.5'' x 11'' paper
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% STEP 2: done!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\input{00.main.hdr.tex}
%%%% EXERCISES and PROBLEMS %%%%%%%%%%%%%%%%%%%%%%%%%%
%\usepackage{answers}
\usepackage[nosolutionfiles]{answers} % use when PROOFREADING
\input{00.exercises_problems.hdr.tex}
\begin{document}
\maketitle
\vspace{-1cm}
\noindent
{ \center
git \gitrevisionnumber
}
\vspace{1cm}
\setcounter{tocdepth}{2}
\setcounter{secnumdepth}{2}
\tableofcontents
\vspace{1cm}
\section{Linear programming}
\label{applications:linear_programming}
In the early days of computing,
computers were primarily used to solve optimization problems
so the term ``programming'' is often used to describe optimization problems.
\emph{Linear programming} is the study of linear optimization problems that involve linear constraints.
% linear programming, quadratic programming, semidefinite programming.
Optimization problems play an important role in many business applications:
the whole point of a corporation is to constantly optimize profits,
subject to time, energy, and legal constraints.
% their operations within budgetary, time, and production constraints.
Suppose you want to maximize the quantity $g(x,y)$ subject to some constraints on the values $x$ and $y$.
To maximize $g(x,y)$ means to find the values of $x$ and $y$ that make $g(x,y)$ as large as possible.
Let's assume the \emph{objective function} $g(x,y)$ represents your company's revenue,
and the variables $x$ and $y$ correspond to monthly production rates of ``Xapper'' machines and ``Yapper'' machines.
You want to choose the production rates $x$ and $y$ that maximize revenue.
If the revenue from each Xapper machine is \$3000 and the revenue from each Yapper machine is \$2000,
the monthly revenue is described by the function $g(x,y) = 3000x + 2000y$.
% This is the objective function.
Due to the limitations of the current production facilities,
the rates $(x,y)$ are subject to various constraints.
We'll assume each constraint can be written in the form $a_1x+a_2y \leq b$.
%
The maximum number of Xapper machines that can be produced in a month is three, written $x\leq3$.
Similarly, the company can produce at most four Yapper machines, denoted $y \leq 4$.
Suppose it takes two employees to produce each Xapper machine and one employee to produce each Yapper machine.
If the company has a total of seven employees, % TODO: change phrase to remove conditional If...
the human resources limits impose the constraint $2x+y \leq 7$ on the production rates.
Finally, logistic constraints allow for at most five machines to be shipped each month, which we write as $x+y \leq 5$.
This production rate optimization problem can be expressed as the following \emph{linear program}:
\begin{align*}
\max_{x,y} g(x,y) \ &=\ 3000x \, + \, 2000y, \\ % \qquad [\$], \\
\intertext{subject to the constraints}
x \ &\leq \ 3, \\
y \ &\leq \ 4, \\
2x + y \ &\leq \ 7, \\
x + y \ &\leq \ 5, \\
x \geq 0, & \quad y \geq 0.
\end{align*}
Each of the inequalities represents one of the real-world production constraints.
We also included the non-negativity constraints $x \geq 0$ and $y \geq 0$ to show
it's impossible to produce a negative number of machines---we're not doing an accounting scam here,
this is a legit Xapper--Yapper business.
On first hand the problem looks deceptively simple.
We want to find the coordinates $(x,y)$ that maximize the objective function $g(x,y)$,
so we can simply find the direction of maximum growth of $g(x,y)$ and go as far as
possible in that direction.
Rather than attempt to plot the function $g(x,y)$ in three dimensions,
we can visualize the growth of $g(x,y)$ by drawing \emph{level curves} of the function,
which are analogous to the lines shown on topographic maps.
Each line represents some constant height: $g(x,y)=cn$, for $n \in \{0,1,2,3, \ldots \}$.
Figure~\ref{fig:linear_programming_level_curves_gxy} shows the level curves
of the objective function $g(x,y)=3000x + 2000y$ at intervals of $3000$.
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.6\textwidth]{figures/linear_algebra/linear_programming_level_curves_gxy.pdf}
%TODO: annotate each line with g value
\end{center}
\vspace{-6mm}
\caption{The objective function $g(x,y)=3000x + 2000y$ grows with $x$ and $y$.
The direction of maximum growth is $(3,2)$---if you think of $g(x,y)$
as describing the height of a terrain,
then the vector $(3,2)$ points uphill.
The dashed lines in the graph represent the following \emph{level curves}:
$g(x,y)=3000$, $g(x,y)=6000$, $g(x,y)=9000$, $g(x,y)=12000$, and so on.} % TODO: update here if nums go in fig
\label{fig:linear_programming_level_curves_gxy}
\end{figure}
Linear programming problems are mainly interesting because of the constraints imposed on the feasible region.
%
Each inequality corresponds to a restriction on the possible production rates $(x,y)$.
%
A coordinate pair $(x,y)$ that satisfies all constraints is called a \emph{feasible point}.
The \emph{feasible region} is the set of all feasible points.
We can represent the constraint region graphically by shading out parts of the $xy$-plane,
as show in Figure~\ref{fig:linear_programming_feasible_region}.
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.6\textwidth]{figures/linear_algebra/linear_programming_feasible_region.pdf}
% TODO: add x >0 and y >0 labels in bottom left corner
\end{center}
\vspace{-6mm}
\caption{The \emph{feasible region} for the linear programming problem.
The feasible region is the subset of the $xy$-plane that contains points $(x,y)$ satisfying all the constraints.
}
\label{fig:linear_programming_feasible_region}
\end{figure}
\noindent
\textbf{Which feasible point produces the maximum value of $g(x,y)$?}
This is the question we aim to answer in linear programming.
%
The linear programming problem illustrated in Figure~\ref{fig:linear_programming_feasible_region} is
simple enough that you can solve it by simply looking at the graph.
The highest level curve that touches the feasible region is $g(x,y)=12\,000$,
and the feasible point that lies on this level curve is $(2,3)$.
The solution to the optimization problem is:
\[
\max_{(x,y) \textrm{ feasible}} g(x,y) = 12\,000
\qquad
\textrm{and}
\qquad
\mathop{\textrm{argmax}}_{(x,y) \textrm{ feasible}} g(x,y) = (2,3).
\]
%
Real-life linear programming problems usually involve hundreds of variables,
so it's not possible simply to ``look'' at the constraint region and find the optimal solution.
% ---unless you can see surfaces and inequalities in 100-dimensions.
% Wow how much coffee did you drink to do that?
We need to develop a systematic approach---an algorithm---which doesn't
depend on our ability to visualize the geometry of high-dimensional surfaces.
\subsection{Simplex algorithm}
\label{applications:simplex_algorithm}
The \emph{simplex algorithm}, invented in 1947,
is a systematic procedure for finding optimal solutions to linear programming problems.
The main idea of the simplex algorithm is to start from one of the corner points of the
feasible region and ``move'' along the sides of the feasible region until we find the maximum.
The reason why this ``sticking to the sides'' strategy works is that maximum solutions to linear programming
problems always occur at the corners of the feasible region.
Therefore, we're sure to find the maximum if we visit all the corners.
Furthermore, in each iteration of the simplex algorithm we will always move along an edge where $g(x,y)$ increases.
Thus by moving from edge to edge in directions where $g(x,y)$ increases,
sooner or later we'll reach the corner point where $g(x,y)$ is maximum.
The steps of the simplex algorithm are as follows:
\begin{description}[style=unboxed]
\item[INPUTS:] Objective function $g(\vec{x})$ and constraints of the form $\vec{a}_i \cdot \vec{x} \leq b_i$.
\item[SETUP:] Construct the \emph{tableau}:
\begin{itemize}
\item Place constraint equations and slack variables in the first rows.
\item Place $-g(\vec{x})$ in the last row of the tableau.
\end{itemize}
\item[INITIALIZATION:] Start the algorithm from the coordinates $\vec{x}_0$ that correspond
to a vertex of the constraint region.
\item[MAIN LOOP:] Repeat while negative numbers in the last row of the tableau exist:
\begin{description}[style=unboxed]
\item[Choose pivot variable:] The pivot variable is the most negative number in the last row of the tableau.
\item[Choose pivot row:] The pivot row corresponds to the first constraint that will become active when the pivot variable increases.
\item[Move:] The algorithm ``moves'' from the vertex with coordinates $\vec{x}$ to the vertex with coordinates $\vec{x}^\prime$,
which is defined as a vertex where the pivot row is active.
\item[Change of variables:]
Perform the change of variables from the coordinates system $B$ with origin at $\vec{x}$
to a coordinates system $B^\prime$ with origin at $\vec{x}^\prime$.
This involves row operations on the tableau.
\end{description}
\item[OUTPUT:] When no more negative numbers in the last row exist,
we have reached an optimal solution $\vec{x}^* \equiv \mathop{\textrm{argmax}}_{\vec{x}} g(\vec{x})$.
\end{description}
\noindent
Though not exactly identical to the Gauss--Jordan elimination procedure,
the simplex algorithm is similar to it because it depends on the use of row operations.
For this reason, linear programming and the simplex algorithm are often forced upon
students taking a linear algebra course, especially business students.
I'm not going to lie to you and tell you the simplex algorithm is simple,
but it is very powerful so you should know it exists,
and develop a general intuition about how it works.
And, as with all things corporate-related,
it's worth learning about them so you'll know the techniques of the enemy.
In the remainder of this section, we'll go through the steps of the simplex algorithm
needed to find the optimal solution to the Xapper-and-Yapper production problem.
We'll analyze the problem from three different angles:
graphically by drawing shifted coordinate systems,
analytically by writing systems of equations,
and computationally by using a new matrix-like structure called a \emph{tableau}.
%which is useful for performing simplex algorithm calculations.
%
\subsubsection{Definitions}
We now introduce some useful terminology used in linear programming:
\begin{itemize}
\item An inequality $a_1x+a_2y \leq b$ is \emph{loose} (or \emph{slack}) for a given $(x,y)$
if there exists a positive constant $s >0$ such that $a_1x+a_2y +s = b$.
\item An inequality $a_1x+a_2y \leq b$ is \emph{tight} for a given $(x,y)$ if the equality conditions holds $a_1x+a_2y = b$.
\item $s$: \emph{slack variable}. Slack variables can be added to any inequality $a_1x+a_2y \leq b$
to transform it into an equality $a_1x+a_2y +s =b$. Note slack variables are always nonnegative $s \geq 0$.
\item A \emph{vertex} of an $n$-dimensional constraint region is a point where $n$ inequalities are tight.
\item An \emph{edge} of an $n$-dimensional constraint region is place where $n-1$ inequalities are tight.
\item A \emph{pivot variable} is a variable whose increase leads to an increase in the objective function.
\item A \emph{pivot row} is a row in the tableau that corresponds to the currently active edge during
a ``move'' operation of the simplex algorithm.
\item $\vec{v}$: a vector that represents the current \emph{state} of the linear program.
\item A \emph{tableau} is a matrix that represent the constraints on the feasible region
and the current value of the objective function.
Tableaus allow us to solve linear programming problems using row operations.
\end{itemize}
\subsubsection{Introducing slack variables}
In each step of the simplex algorithm,
we'll keep track of which constraints are \emph{active} (tight) and which are \emph{inactive} (loose).
%
To help with this bookkeeping task,
we introduce nonnegative a \emph{slack variable} $s_i \geq 0$ for each of the inequality constraints.
%
If $s_i > 0$, inequality $i$ is loose (not active),
and if $s_i=0$, inequality $i$ is tight (active).
If we want to remain within the feasible region,
no $s_i$ can become negative,
since $s_i< 0$ implies inequality $i$ is not satisfied.
Introducing the slack variables, the linear program becomes:
\begin{align*}
\qquad \qquad \max_{x,y} g(x,y) &= 3000x + 2000y, \\
\intertext{subject to the equality constraints:}
x \ \ \quad +s_1 \quad \quad \quad \ &= \ 3, \\
y \quad +s_2 \quad \quad \ &= \ 4, \\
2x + y \quad \quad + s_3 \quad \ &= \ 7, \\
x + y \quad \quad \quad + s_4 \ &= \ 5, \\
x,y,s_1,s_2,s_3,s_4 \ &\geq \ 0. \\
\end{align*}
\noindent
This is starting to look a lot like a system of linear equations, right?
Recall that what matters the most in systems of linear equations are the \emph{coefficients},
and not the variable names.
Previously,
when faced with overwhelming complexity in the form of linear equations with many variables,
we used an augmented matrix to help us focus only on what matters.
We'll use a similar approach again.
\subsubsection{Introducing the tableau}
A \emph{tableau} is a compact representation of a linear programming problem in the form of an array of numbers,
analogous to the augmented matrix used to solve systems of linear equations.
{\footnotesize
\[
\begin{array}{rl}
x \ \ \quad +s_1 \quad \quad \quad \ &= \ 3 \\
y \quad +s_2 \quad \quad \ &= \ 4 \\
2x + y \quad \quad + s_3 \quad \ &= \ 7 \\
x + y \quad \quad \quad + s_4 \ &= \ 5 \\
- 3000x - 2000y \quad \quad \ \ \, \ \ &= \ \underline{?}
\end{array}
\quad
\Leftrightarrow
\quad
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0 \ \ & \ 3 \\
0& 1& 0& 1& 0& 0 \ \ & \ 4 \\
2& 1& 0& 0& 1& 0 \ \ & \ 7 \\
1& 1& 0& 0& 0& 1\ \ & \ 5 \\
\!\!\!-3000& \!\!\!\!-2000& 0& 0& 0& 0 \ \ & \ \underline{?}
\end{array}
\right]\!.
\]}%
\noindent
The first six columns of the tableau correspond to the variables $x$, $y$, $s_1$, $s_2$, $s_3$, $s_4$,
and the last column contains the constants from the constraint equations.
%
In the last row, we use the coefficients of the negative objective function $-g(x,y)$.
This strange initialization of the last row of the tableau is a trick we use to calculate the current value of the objective function $g(x,y)$
in the bottom right corner of the tableau (the entry that contains an underlined question mark in the above tableau).
We defer the explanation of why we use the \emph{negative} of the objective function
in the last row until after we learn about the interpretation of row operations on the tableau.
The numbers in the example problem are deliberately chosen to emphasize the distinction between
the constraint rows and the objective-function row.
Small coefficients are used for the constraints and large coefficients in the objective function.
% (the sides of the region) vs level curves
This way it should be clear that two different types of data are recorded in the tableau.
%
Looking back to Figure~\ref{fig:linear_programming_feasible_region}
can help you compare the two scales of the problem:
the linear constraints that delimit the feasible region all fit in a $6\times 6$ coordinate system,
while the distance between the level curves of $g(x,y)$ is $3000$.
%
You can think of the linear programming problem described in Figure~\ref{fig:linear_programming_feasible_region}
as a three dimensional plot of the plane $z=g(x,y)=3000x+2000y$,
restricted to the points that lie above the feasible region.
%
The scale of the $x$- and $y$-axes in such a plot would be much smaller than the scale of the $z$-axis.
% TODO: produce 3d plot of constraint region.
\subsubsection{Start from a vertex}
In each step of the simplex algorithm,
we want to move from one vertex of the constrain region to another vertex,
moving along one of the sides.
%
For the algorithm to do its thing,
it must start from one of the corners of the constraint region.
%
We can start from the origin $(0,0)$,
which is the vertex formed by the intersection of the nonnegativity constraints $x\geq 0$, and $y \geq 0$.
Given $x=0$ and $y=0$, we can deduce the values of the slack variables:
$s_1=3$, $s_2=4$, $s_3=7$, $s_4=5$.
In other words, all constraints are initially maximally slack.
\subsubsection{State vector}
The simplex algorithm requires two types of state.
First, we must record the coordinates $(x,y)$ of the current position (the current vertex being visited).
Second, we need to keep track which constraints are tight and which are slack.
Now this is where things get a little weird.
Instead of keeping track of these two types of information separately,
we'll use a six-dimensional \emph{state vector} that represents the current position \emph{in terms of} the variables in the tableau.
%
The state vector that corresponds to starting the simplex algorithm from $(x,y)=(0,0)$ is
\[
\vec{v} = (0,0, 1, 1, 1, 1 ).
\]
If the $i$\textsuperscript{th} entry of $\vec{v}$ is $1$ then the $i$\textsuperscript{th} column of the tableau is used in the current set of equations.
Otherwise if the $i$\textsuperscript{th} entry of $\vec{v}$ is $0$ then the $i$\textsuperscript{th} column of the tableau
should be ignored.\!\footnote{Readers familiar with programming will recognize $\vec{v}$ serves as a \emph{bitmask}.}
Note that each state vector is tied to a particular tableau and has no meaning on its own.
%
We can understand the select-only-columns-with-one-in-the-state procedure as a matrix multiplication:
{\footnotesize
\[
\left[
\begin{array}{rrrrrr}
1& 0& 1& 0& 0& 0 \\
0& 1& 0& 1& 0& 0 \\
2& 1& 0& 0& 1& 0 \\
1& 1& 0& 0& 0& 1 \\
\!\!\!-3000& \!\!\!\!-2000& 0& 0& 0& 0
\end{array}
\right]
\!\!
\begin{bmatrix}
0 \\ 0 \\ 1 \\ 1 \\ 1 \\ 1
\end{bmatrix}\!
=
\!
\begin{bmatrix}
3 \\ 4 \\ 7 \\ 5 \\ 0
\end{bmatrix}
\quad
\Rightarrow
\ \ \
\begin{array}{rl}
s_1 \quad \quad \quad \ &= 3 \\
\quad s_2 \quad \quad \ &= 4 \\
\quad \quad s_3 \quad \ &= 7 \\
\quad \quad \quad s_4 \ &= 5 \\
- 3000(0) - 2000(0) \!\!&= 0.
\end{array}
\]}%
The current problem variables have value $x=0$ and $y=0$,
and each of the constraint equations is maximally slack.
The value of the objective function at this vertex is $g(0,0)=0$.
I know what you're thinking.
Surely it would have been simpler to keep track of the position $(x,y)$
and the slackness separately $(s_1,s_2,s_3,s_4)$,
rather than to invent a binary state vector $\vec{v} \in \{0, 1\}^{6}$,
and then depend on matrix multiplication to find the coordinates.
%
% and thus operate on them simultaneously through row operations.
% You'll see things will become clear when we start with the row operations,
% but before we move on let's just look at Figure~\ref{fig:linear_programming_1} to see where we are.
True that.
But the energy we invested to represent the constraints as a tableau
will allow us to perform complex geometrical operations by performing row operations on the tableau.
%
The key benefit of the combined representation is that it treats
problem variables and slack variables on the same footing.
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.5\textwidth]{figures/linear_algebra/linear_programming_1.pdf}
\end{center}
\vspace{-6mm}
\caption{The $(x,y)$ coordinates are measured with respect to $(0,0)$.}
\label{fig:linear_programming_1}
\end{figure}
It's now time to start making progress.
Let's make a move. Let's cut some slack!
\subsubsection{Choose the pivot variable}
The simplex algorithm continues while there exist negative numbers in the last row of the tableau.
Recall we initialized the last row of the tableau with the coefficients of $-g(x,y)$
and the initial value to $g(0,0)=0$.
%
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0 \ \ & \ 3 \ \ \\
0& 1& 0& 1& 0& 0 \ \ & \ 4 \ \ \\
2& 1& 0& 0& 1& 0 \ \ & \ 7 \ \ \\
1& 1& 0& 0& 0& 1\ \ & \ 5 \ \ \\
-3000& -2000& 0& 0& 0& 0 \ \ & \ 0 \ \
\end{array}
\right].
\]
Both the $x$ and $y$ columns contain negative numbers.
This means the objective function would increase if were to increase either the $x$ or the $y$ variables.
%
The coefficient $-3000$ in the last row represents our \emph{incentives} towards increasing the variable $x$:
each increase of $1$ step in the Xapper production will result in an increase of $3000$ in the value of $g(x,y)$.
Similarly, the coefficient $-2000$ indicates that a unit-step in the $y$ direction will increase $g(x,y)$ by $2000$.
It's a bit complicated why we put the \emph{negatives} of the coefficients in the row for the objective function,
but you'll have a chance to convince yourself it all works out nicely when you see the row operations.
Given the options $x$ and $y$,
we choose to increase the $x$ variable since it leads to a biggest gain in profits.
Remember, we're playing the role of the greedy corporate capitalist whose only motivation is to maximize profits.
\subsubsection{Choose the pivot row}
We've decided to increase the $x$ variable.
The next step is to check how big we can make the $x$ variable before hitting one of the constraints.
%
% To find out we check each of the four constraints to find which will become active
% We say $x$ is the \emph{pivot variable}---the variable
Which constraint do we hit first when we move in the $x$ direction?
In the first equation $x + s_1 = 3$, we could increase $x$ up to $x=3$.
In the second equation $y + s_2 = 0 + s_2 = 4$, the $x$-variable doesn't appear so it ``allows'' an arbitrary increase in $x$.
The third constraint $2x + y + s_3 = 2x + 0 + s_3 = 7$ allows an increase in $x$ up to $x \leq 7/2=3.5$.
Finally, the fourth equation $x + 0+ s_4 = 5$ imposes the constraint $x\leq 5$.
The equation $x + s_1 = 3$ allows the smallest increase in $x$,
so this equation will become our \emph{pivot row}.
Specifically, pivoting through the equation $x + s_1 = 3$ means we're decreasing $s_1$ from $3$ to $0$
and correspondingly increasing $x$ from $0$ to $3$.
We have thue de-slacked the constraint $x + s_1 = 3$.
Performing the pivot operation corresponds to the following change in the state vector:
\[
\vec{v} = (0,0, 1, 1, 1, 1 )
\qquad \to
\qquad
\vec{v}^{\prime} = (1,0, 0, 1, 1, 1 ).
\]
To understand the effect of this change of state,
let's look at the constraint equations that result from the new state vector:
{\footnotesize
\[
\left[
\begin{array}{rrrrrr}
1& 0& 1& 0& 0& 0 \ \ \\
0& 1& 0& 1& 0& 0 \ \ \\
2& 1& 0& 0& 1& 0 \ \ \\
1& 1& 0& 0& 0& 1 \ \
% \\
% -3000& -2000& 0& 0& 0& 0 \ \
\end{array}
\right]
\!\!
\begin{bmatrix}
1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 1
\end{bmatrix}
=
\begin{bmatrix}
3 \\ 4 \\ 7 \\ 5 \\ 0
\end{bmatrix}
\qquad
\Rightarrow
\qquad
\begin{array}{rl}
x \ + \ \ \ \quad \quad \quad \ &= \ 3 \\
\quad s_2 \quad \quad \ &= \ 4 \\
\quad \quad s_3 \quad \ &= \ 7 \\
\quad \quad \quad s_4 \ &= \ 5 \ .
% - 3000(0) - 2000(0) \quad \ &= \ \underline{0}
\end{array}
\]}%
The effect of the new state $\vec{v}^\prime$ is to increase $x$ from $0$ to $3$,
which leads to corresponding decrease of $s_1$ from $3$ to $0$.
The value of the six variables are now:
\[
x=3, \quad y=0, \quad s_1=0, \quad s_2=4, \quad s_3=7, \quad s_4=5.
\]
%
Note the change of state vector $\vec{v} \to \vec{v}^\prime$ we performed did not change the
system of equations that describe the constraints of the linear program:
\[
\begin{array}{rl}
x \ + \ \ \ \ \ + \ s_1 \quad \quad \quad \ &= \ 3 \\
y \ + \quad s_2 \quad \quad \ &= \ 4 \\
2x \ + \ y \ + \quad \quad s_3 \quad \ &= \ 7 \\
\ x \ + \ y \ + \quad \quad \quad s_4 \ &= \ 5.
\end{array}
\]
%
This system of equations corresponds to the tableau:
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0 \ \ & \ 3 \ \ \\
0& 1& 0& 1& 0& 0 \ \ & \ 4 \ \ \\
2& 1& 0& 0& 1& 0 \ \ & \ 7 \ \ \\
1& 1& 0& 0& 0& 1\ \ & \ 5 \ \ \\
-3000& -2000& 0& 0& 0& 0 \ \ & \ 0 \ \
\end{array}
\right].
\]
Except for the increase in $x$ and the corresponding decrease in $s_1$,
we haven't done anything to the tableau.
The next step of the simplex algorithm involves performing row operations on the tableau.
\subsubsection{Change of variables}
Knowing $x=3$, we can subtract this equation from the other constraints,
to remove the variable $x$ from the system of equations.
We can eliminate the variable $x$ from the system of equations
by performing the following row operations:
\begin{itemize}
\item $R_3 \gets R_3 - 2R_1$
\item $R_4 \gets R_4 - R_1$
\item $R_5 \gets R_5 + 3000R_1$
\end{itemize}
% >>> M.row(2& lambda v,j: v-2*M[0,j] )
% >>> M.row(3& lambda v,j: v-1*M[0,j] )
% >>> M.row(4& lambda v,j: v+3000*M[0,j] )
\noindent
The result is the following tableau:
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0& 3 \\
0& 1& 0& 1& 0& 0& 4 \\
0& 1& -2& 0& 1& 0& 1 \\
0& 1& -1& 0& 0& 1& 2 \\
0& -2000& 3000& 0& 0& 0& 9000
\end{array}
\right].
\]
Geometrically speaking,
the effect of the row operations is a change of coordinate system.
The equations in the new tableau are expressed with respect to the
coordinate system $(x',y')$ whose origin is at $(3,0)$.
Figure~\ref{fig:linear_programming_1} illustrates this change of coordinate system:
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.5\textwidth]{figures/linear_algebra/linear_programming_2.pdf}
\end{center}
\vspace{-6mm}
\caption{The origin of the original $(x,y)$ coordinates system is at $(0,0)$.
The origin of the new $(x',y')$ coordinates system is at $(3,0)$.}
\label{fig:linear_programming_1}
\end{figure}
The values of the six variables after performing the change of coordinate system are:
\[
x=3, \quad y'=0, \quad s'_1=0, \quad s'_2=4, \quad s'_3=1, \quad s'_4=2,
\]
and the current state of the tableau is
\vspace{1mm}
{\footnotesize
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0& 3 \\
0& 1& 0& 1& 0& 0& 4 \\
0& 1& -2& 0& 1& 0& 1 \\
0& 1& -1& 0& 0& 1& 2 \\
0& -2000& 3000& 0& 0& 0& 9000
\end{array}
\right]
\quad
\Leftrightarrow
\quad
\begin{array}{rl}
x \ + \ \ \ \ \ + \, \ \ s'_1 \quad \quad \quad \ &= \ 3 \\
y' \ + \quad \ \ s'_2 \quad \quad \ &= \ 4 \\
y' \ - \ 2s'_1 \quad s'_3 \quad \ &= \ 1 \\
y' \ - \, \ \ s'_1 \quad \quad s'_4 \ &= \ 2.
\end{array}
\]}%
Note $s'_1=0$ in the system of equations so we could completely ignore this variable and the associated column of the tableau.
Nevertheless, we choose to keep the $s'_1$ column around as a bookkeeping device,
because it's useful to keep track of how many times we have subtracted the first row from the other rows.
We have now completed the first step in of the simplex algorithm,
and we continue the procedure by looking for the next pivot.
\subsubsection{Choose the pivot}
Observe the second column of the row that corresponds to the objective function contains a negative number,
which means the objective function will increase if we increase the variable $y$.
% advantageous
Having decided to increase $y'$
we must now decide by what amount can we increase the variable $y'$ before we hit one of the constraints?
We can find this out by computing the ratios of the constants in the rightmost column of the tableau,
and the $y'$ coefficients of each row.
The first constraint does not contain $y'$ at all, so it allows any increase in $y'$.
The second constraint will become active when $y' = 4$.
The third constraint allows $y'$ to go up to $y'=1$,
and the fourth constraint becomes active when $y'=2$.
%>>> [ M[i,6]/M[i,1] for i in range(0,4) ]
%[oo& 4& 1& 2]
%
The largest increase in $y'$ that satisfies all the constrains is $y'=1$.
We'll use the third row of the tableau as the \emph{pivot row},
which means we de-slack $s'_3$ from $1$ to $0$ and correspondingly increasing $y'$ from $0$ to $1$.
After pivoting, the values of the six variables are
\[
x'=3, \quad y'=1, \quad s''_1=0, \quad s''_2=4, \quad s''_3=0, \quad s''_4=2.
\]
\subsubsection{Change of variables}
We can now subtract the equation $y'=1$ from the other $y'$-containing rows.
The required row operations are
\begin{itemize}
\item $R_2 \gets R_2 - R_3$
\item $R_4 \gets R_4 - R_3$
\item $R_5 \gets R_5 + 2000R_3$
\end{itemize}
%>>> M.row(1& lambda v,j: v-1*M[2,j] )
%>>> M.row(3& lambda v,j: v-1*M[2,j] )
%>>> M.row(4& lambda v,j: v+2000*M[2,j] )
After applying these row operations, the resulting tableau is
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 1& 0& 0& 0& 3 \\
0& 0& 2& 1& -1& 0& 3 \\
0& 1& -2& 0& 1& 0& 1 \\
0& 0& 1& 0& -1& 1& 1 \\
0& 0& -1000& 0& 2000& 0& 11000
\end{array}
\right].
\]
\noindent
The geometrical interpretation of subtracting the equation $y'=1$ from the other constraints
is a change to a new coordinate system $(x'',y'')$ with origin at $(3,1)$,
as illustrated in Figure~\ref{fig:linear_programming_3}.
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.5\textwidth]{figures/linear_algebra/linear_programming_3.pdf}
\end{center}
\vspace{-6mm}
\caption{The origin of the $(x'',y'')$ coordinate system is at $(3,1)$.}
\label{fig:linear_programming_3}
\end{figure}
\subsubsection{Choose a pivot}
The third column of the tableau contains a negative number,
which means the objective function will increase if we increase $s''_1$.
We pick $s''_1$ as the next pivot variable.
We now check the different constraints containing $s''_1$ to see which will become
active first when we increase $s''_1$.
The first inequality imposes $s''_1 \leq 3$,
the second imposes $s''_1 \leq 3/2=1.5$,
the third equation contains a negative amount of $s''_1$ and thus can't be used as a pivot.
The equation in the fourth row, $s''_1 -s''_3 + s'_4 = 1$,
imposes the constraint $s''_1 \leq 1$,
and is the tightest constraint.
Therefore, we'll use the fourth row equation as the pivot.
%>>> [ M[i,6]/M[i,2] for i in range(0,4) ]
%[3& 3/2& -1/2& 1]
After pivoting, the new values of the variables are
\[
x''=3, \quad y''=1, \quad s''_1=1, \quad s''_2=4, \quad s''_3=0, \quad s''_4=0.
\]
We now carry out the following row operations to eliminate $s''_1$ from the other eqations:
\begin{itemize}
\item $R_1 \gets R_1 - R_4$
\item $R_2 \gets R_2 - 2R_4$
\item $R_3 \gets R_3 + 2R_4$
\item $R_5 \gets R_5 + 1000R_4$
\end{itemize}
%>>> M.row(0& lambda v,j: v-1*M[3,j] )
%>>> M.row(1& lambda v,j: v-2*M[3,j] )
%>>> M.row(2& lambda v,j: v+2*M[3,j] )
%>>> M.row(4& lambda v,j: v+1000*M[3,j] )
\noindent
The final tableau is
\[
\left[
\begin{array}{rrrrrr|r}
1& 0& 0& 0& 1& -1& 2 \\
0& 0& 0& 1& 1& -2& 1 \\
0& 1& 0& 0& -1& 2& 3 \\
0& 0& 1& 0& -1& 1& 1 \\
0& 0& 0& 0& 1000& 1000& 12000
\end{array}
\right].
\]
We state variables that correspond to this tableau are
\[
x'''=2, \quad y'''=3, \quad s'''_1=1, \quad s'''_2=1, \quad s'''_3=0, \quad s'''_4=0.
\]
We know this tableau is \emph{final} because there are no more negative numbers in the last row.
This means there are no more directions we can move in that would increase the objective function.
%
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.5\textwidth]{figures/linear_algebra/linear_programming_4.pdf}
\end{center}
\vspace{-3mm}
\caption{The $(x''',y''')$ coordinates are measured with respect to the point $(2,3)$.
The simplex algorithm stops because it's not possible to increase any variable
in a direction to increase the objective function $g(x,y)$.}
\label{fig:linear_programming_4}
\end{figure}
\subsubsection{Regroup}
If you're finding linear programming and the simplex algorithm difficult to comprehend,
you're not alone.
It will take some time to understand the procedure,
but if you solve a few practice problems, you'll get the hang of it.
Linear programming problems with few parameters and few constraints are simple to solve,
but problems that involve hundreds of variables and hundreds of constraints can be very difficult solve.
Linear programming problems with hundreds and even thousands of variables are common in real-world scenarios.
For this reason, it is very important to approach linear programming problems
in a systematic manner so that we can teach a computer to do the steps for us.
In the next section we enlist the help of \texttt{SymPy} to retrace the steps of the simplex algorithm for the Xapper--Yapper linear program.
\subsection{Using \texttt{SymPy} to solve linear programming problems}
Recall each \texttt{SymPy} \texttt{Matrix} object has a method called \texttt{.row(...)} that
can be used to perform row operations on the matrix.
%
Let's see if we can't make linear programming a little bit more bearable by using \texttt{SymPy}
to perform the row operations on the tableau.
Letting \texttt{SymPy} do the tedious row operations work for us will
let us focus on the \emph{state} of the algorithm in each step.
Since \texttt{SymPy} will be doing the calculations for us,
it will not hurt if we include the nonnegativity constraints $x\geq0$ and $y\geq 0$ into our analysis.
% See Figure~\ref{fig:linear_programming_feasible_region} (page~\pageref{fig:linear_programming_feasible_region})
% for an illustration of the constraint region.
We rewrite these constraints as $-x \leq 0$ and $-y \leq 0$ in order to conform to the standard convention ($a_1x+a_2y \leq b$)
for expressing constraints.
The full system of constrains that correspond to the feasible region is
\[
\begin{array}{rl}
x \ \ \quad +s_1 \quad \quad \quad \quad \quad \ &= \ 3 \\
y \quad +s_2 \quad \quad \quad \quad \ &= \ 4 \\
2x + y \quad \quad + s_3 \quad \quad \quad \ &= \ 7 \\
x + y \quad \quad \quad + s_4 \quad \quad \ &= \ 5 \\
-x \ \ \ \quad \ \ \quad \quad \quad +s_5 \quad \ &= \ 0 \\
-y \quad \quad \quad \quad \quad + s_6 \ &= \ 0. \\
\end{array}
\]
\noindent
We can create a tableau as a \texttt{SymPy} matrix object whose coefficients
correspond to the above system of equations. Use the following command:
\begin{verbatim}
>>> M = Matrix([
[ 1, 0, 1, 0, 0, 0, 0, 0, 3]
[ 0, 1, 0, 1, 0, 0, 0, 0, 4]
[ 2, 1, 0, 0, 1, 0, 0, 0, 7]
[ 1, 1, 0, 0, 0, 1, 0, 0, 5]
[ -1, 0, 0, 0, 0, 0, 1, 0, 0]
[ 0, -1, 0, 0, 0, 0, 0, 1, 0]
[-3000,-2000, 0, 0, 0, 0, 0, 0, 0]] )
\end{verbatim}
\noindent
Note constructing the tableau did not require any special command---a tableau is just a regular \texttt{Matrix} object.
The first six rows of the matrix~$M$ correspond to the constraints on the feasible region,
while the last row contains the negatives of the coefficients of the objective function $-g(x,y)=-3000x-2000y$.
\subsubsection{Vertices and sides}
The simplex algorithm starts from a \emph{vertex} of the feasible region,
and moves along the \emph{edges} of the feasible region.
For the two-dimensional Xapper--Yapper production problem,
the vertices of the region are the corners,
and the edges are the sides of a two-dimensional \emph{polygon}.
A polygon is the general math term for describing a subset of the $xy$-plane
delimited by a finite chain of straight line segments.
More generally,
the constraint region of an $n$-dimensional linear programming problem has the shape of an $n$-dimensional \emph{polytope}.
A vertex of the polytope corresponds to a place where $n$ constraint inequalities are satisfied,
while an edge corresponds to the intersection of $n-1$ inequalities.
In each step of the simplex algorithm,
we'll keep track of which constraints in the tableau are \emph{active} (tight) and which are \emph{inactive} (loose).
The simplex algorithm must be started from one of the vertices of the rate region.
For the Xapper--Yapper production problem,
we initialized the tableau from the corner $(0,0)$,
which is the vertex where the two nonnegativity constraints intersect.
\begin{verbatim}
>>> M
[ 1, 0, 1, 0, 0, 0, 0, 0, 3]
[ 0, 1, 0, 1, 0, 0, 0, 0, 4]
[ 2, 1, 0, 0, 1, 0, 0, 0, 7]
[ 1, 1, 0, 0, 0, 1, 0, 0, 5]
[ -1, 0, 0, 0, 0, 0, 1, 0, 0] (active)
[ 0, -1, 0, 0, 0, 0, 0, 1, 0] (active)
[-3000, -2000, 0, 0, 0, 0, 0, 0, 0]
\end{verbatim}
\noindent
The current \emph{state} of the simplex algorithm is completely specified by the two \texttt{(active)} flags.
We can deduce the value of all state variables from the knowledge that the inequalities
$-x \leq 0$ and $-y \leq 0$ are tight.
The fifth and sixths constraints are active,
which means $s_5=0$ and $s_6=0$,
from which we deduce that $x=0$ and $y=0$.
If $x=0$ and $y=0$,
then the other inequalities are maximally slack and thus $s_1=3$, $s_2=4$, $s_3=7$, $s_4=5$.
\subsubsection{Choose the pivot 1}
In the first step we choose to increase the $x$-variable since it leads to the bigger gain in $g(x,y)$
($3000$ per unit step),
as compared to an increase in the $y$-direction ($2000$ per unit step).
Next we must choose the pivot row.
How big of a step can we make in the $x$-direction before hitting some constraint?
To calculate the maximum allowed step in each direction,
you must divide the current slack available by the $x$ coefficient in each inequality:
\begin{verbatim}
>>> [ M[i,8]/M[i,0] for i in range(0,4) ]
[3, oo, 7/2, 5]
\end{verbatim}
\noindent
The above magical python invocation computes the largest step in the $x$ direction allowed by constant in each constraint.
Since the step size of the first row (index 0) is the smallest,
we must choose the first row as our pivot.
\subsubsection{Change of variables 1}
Carry out the necessary row operations to eliminate all other numbers in the $x$ column.
\begin{verbatim}
>>> M.row(2, lambda v,j: v-2*M[0,j] )
>>> M.row(3, lambda v,j: v-1*M[0,j] )
>>> M.row(4, lambda v,j: v+1*M[0,j] )
>>> M.row(6, lambda v,j: v+3000*M[0,j] )
>>> M
[1, 0, 1, 0, 0, 0, 0, 0, 3] (active)
[0, 1, 0, 1, 0, 0, 0, 0, 4]
[0, 1, -2, 0, 1, 0, 0, 0, 1]
[0, 1, -1, 0, 0, 1, 0, 0, 2]
[0, 0, 1, 0, 0, 0, 1, 0, 3]
[0, -1, 0, 0, 0, 0, 0, 1, 0] (active)
[0, -2000, 3000, 0, 0, 0, 0, 0, 9000]
\end{verbatim}
\noindent
Again, the state of the linear program can be deduced from the information about which constraints are currently active.
Since the first constraint is active ($s_1=0$),
we know $x=3$ and sixth inequality hasn't changed so we still have $y=0$.
\subsubsection{Choose the pivot 2}
This time we see there are gains to be had if we increase the $y$-variable,
so we check which constraint will become active first as we increase the $y$-variable.
Note we \texttt{M[i,1]} corresponds to the second column of the \texttt{Matrix} object \texttt{M}
because \texttt{Python} uses $0$-based indexing.
\begin{verbatim}
>>> [ M[i,8]/M[i,1] for i in range(0,4) ]
[oo, 4, 1, 2].
\end{verbatim}
\noindent
The third row (index \texttt{2}) is the corresponds to the constrain that allows the smallest step
in the $y$-direction, so we must use the third row as our pivot.
\subsubsection{Change of variables 2}
The next step is business as usual: we use a set of row operations to eliminate all the
$y$-coefficients above and below the third row.
\begin{verbatim}
>>> M.row(1, lambda v,j: v-1*M[2,j] )
>>> M.row(3, lambda v,j: v-1*M[2,j] )
>>> M.row(5, lambda v,j: v+1*M[2,j] )
>>> M.row(6, lambda v,j: v+2000*M[2,j] )
>>> M
[1, 0, 1, 0, 0, 0, 0, 0, 3] (active)
[0, 0, 2, 1, -1, 0, 0, 0, 3]
[0, 1, -2, 0, 1, 0, 0, 0, 1] (active)
[0, 0, 1, 0, -1, 1, 0, 0, 1]
[0, 0, 1, 0, 0, 0, 1, 0, 3]
[0, 0, -2, 0, 1, 0, 0, 1, 1]
[0, 0, -1000, 0, 2000, 0, 0, 0, 11000]
\end{verbatim}
\noindent
The first and the third row have become the active constraints,
which means $x= 3$ and $2x+y= 7$.
The full state of the algorithm is
\[
x=3, \quad y=1, \quad s_1=0, \quad s_2=3, \quad s_3=0, \quad s_4=1.
\]
% \noindent
% \texttt{TODO: check $s_2=3$ ??}
% \ \\