Purpose: Suppose you have several computers running Debian Linux and you experiment with them a lot: installing various .deb packages, trying out kernels, testing hardware, and then re-installing the system, over and over again. The problem is that every time you re-install, you have to download the packages again on each machine.
For example, just to get a basic graphical environment running you need to do at least the following:
# apt-get install gnome gdm xserver-xorg xfonts-base
This alone is about 237 MB of downloads on Debian Etch (4.0) as of 15th April 2008.
Now imagine having to do this on each of your five networked machines. Won’t it be painful to download the same packages again and again, especially if you have a slow Internet connection? However, there is a Debian utility called “apt-proxy” with which each package only has to be downloaded once, through a “Host” computer. The rest of your computers, i.e. the “Clients”, then download from the Host computer’s hard drive rather than from the Debian Repository over the Internet.
What is the difference between a Proxy Cache Server and a Debian Mirror?
Many people (just like me) want to set up a Proxy Cache server but confuse it with setting up a Debian Mirror, or are not aware that the two are different. Setting up a Debian Mirror is a time- and resource-consuming activity: you need a dedicated server and at least 500 GB of hard drive space just to store the deb packages.
For more details please refer to the following links:
2. Debian APT HOW-TO Manual
If you came here looking for how to set up a Debian Mirror, then I am afraid that this article is not for you.
How do I set up a cache proxy server?
In this article we will use the following terminology:
Host Computer – Which will act as a Debian local repository
Client Computers – Will download packages from the Host computer
Step 1: Install apt-proxy package
Log in to any one of the Debian Linux computers on your network and do the following:
# apt-get update
# apt-get install apt-proxy
Note: We call this computer the Host computer because we have installed apt-proxy on it. In short, any computer on which you install the apt-proxy package becomes the Host computer. You only need to install it on one computer on your network. This Host computer should also have enough disk space to store the packages. A normal user is unlikely to install more than 10 GB of packages, but it depends on your usage, so this is something that you will have to determine on your own.
You DO NOT need to install apt-proxy package on any of your client computers.
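Before settling on a Host, it may help to check how much room is free where the cache will live. The following is only a small sketch: free_mb is an illustrative helper name of my own (not part of apt-proxy), and it assumes the cache will sit under /var/cache, which is where the default cache directory /var/cache/apt-proxy lives.

```shell
# Print the free space, in whole megabytes, on the filesystem that holds a path.
# Defaults to /var/cache, the parent of apt-proxy's default cache directory.
# free_mb is a hypothetical helper name used only for this example.
free_mb() {
    # -P forces one POSIX-format line per filesystem; -k reports 1024-byte blocks.
    # Field 4 of the second line is the "Available" column.
    df -Pk "${1:-/var/cache}" | awk 'NR == 2 { printf "%d\n", $4 / 1024 }'
}
```

After pasting the function into a shell on the Host, running free_mb with no argument reports the free megabytes for /var/cache, which you can compare against the 10 GB rule of thumb mentioned above.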
Step 2: Configure the apt-proxy-v2.conf file
Now on your host computer edit the apt-proxy-v2.conf file:
# nano /etc/apt-proxy/apt-proxy-v2.conf
This file contains various variables that you can configure. In most cases you won’t need to make any changes to the file; the default settings will work just fine.
However, make sure that the following lines are un-commented (un-comment them if they are not):
;; Server port to listen on
port = 9999
;; Cache directory for apt-proxy
cache_dir = /var/cache/apt-proxy
;; The main Debian archive
;; You can override the default timeout like this:
timeout = 30
;; Backend servers, in order of preference
;; Debian security archive
To un-comment a line means to remove the semicolon (“;”) from the beginning of the line. For example,
;port = 9999 – This is a comment.
port = 9999 – This is an un-commented line.
You can download my sample apt-proxy configuration file and check against yours in case if you are not sure what is going on.
Once you do that, all you need to do is to restart your apt-proxy daemon like this:
# /etc/init.d/apt-proxy restart
This makes the new settings from the apt-proxy-v2.conf file take effect. This is an IMPORTANT step.
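If you are not sure whether a setting is really un-commented, a quick check like the one below can help. It is only a sketch: conf_value is a helper name I made up, and it assumes the simple “key = value” layout shown above, where a leading “;” marks a comment.

```shell
# Print the value of the first UN-commented "key = value" line in a config file.
# Lines that start with ";" are comments and are skipped, because the pattern
# only matches when the key is the first non-blank text on the line.
# conf_value is a hypothetical helper name used only for this example.
conf_value() {
    # $1 = path to the config file, $2 = key name (for example: port)
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\(.*\)\$/\1/p" "$1" | head -n 1
}
```

For example, conf_value /etc/apt-proxy/apt-proxy-v2.conf port should print the port number if the line is active, and print nothing if it is still commented out.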
Step 3: Get the IP address of your Host Computer
Before we move ahead to configure the Client computers we need to get the IP address of the Host computer.
You can easily get this information from the ifconfig command.
Let’s say that the IP address of your Host computer is 192.168.0.10
We will use the above IP address for the rest of the article. Substitute your own Host computer’s IP address wherever it appears.
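If the ifconfig output looks too noisy, the address can be picked out with a small filter. This is a sketch under assumptions: host_ipv4 is an illustrative name of my own, and the two output layouts it handles (“inet addr:…” and “inet …”) are the ones I know from older and newer versions of ifconfig; your system may print something different.

```shell
# Read ifconfig output on stdin and print the non-loopback IPv4 addresses.
# Handles "inet addr:192.168.0.10 ..." (older ifconfig output) as well as
# "inet 192.168.0.10 ..." (newer ifconfig output). IPv6 lines start with
# "inet6" and therefore do not match.
# host_ipv4 is a hypothetical helper name used only for this example.
host_ipv4() {
    sed -n \
        -e '/^[[:space:]]*inet \(addr:\)\{0,1\}127\./d' \
        -e 's/^[[:space:]]*inet \(addr:\)\{0,1\}\([0-9][0-9.]*\).*/\2/p'
}
```

On the Host you would use it as a pipe, i.e. ifconfig | host_ipv4, and note down the address it prints.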
Step 4: Configure sources.list file on your Clients
Now on each of your Client computers, edit the sources.list file to point to the Host computer’s IP address that we found in Step 3:
# nano /etc/apt/sources.list
And change the following lines:
deb http://mirrors.kernel.org/debian stable main contrib
deb-src http://mirrors.kernel.org/debian stable main contrib
to:
deb http://192.168.0.10:9999/debian stable main contrib
deb-src http://192.168.0.10:9999/debian stable main contrib
And do the same for your Security repositories. Change:
#For Security patches
deb http://mirrors.kernel.org/security stable/updates main contrib
to:
#For Security patches
deb http://192.168.0.10:9999/security stable/updates main contrib
There is a good chance that your original sources.list file uses some other mirror, like the following:
deb http://http.us.debian.org/debian/ stable main contrib non-free
In that case too, all you need to do is change the http.us.debian.org part to 192.168.0.10:9999
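If you have several Clients to convert, the same substitution can be done with sed instead of by hand. Treat this as a sketch: rewrite_sources is a name I invented, and it only swaps the mirror’s host name for the Host’s address and port while leaving the rest of each line alone, so the path after the host (for example /debian) must still match a repository that is enabled on your Host.

```shell
# Print a copy of a sources.list file in which every "deb" and "deb-src" line
# points at the apt-proxy Host instead of the original mirror. Comment lines
# and everything after the host name are left untouched. Nothing is written
# back to the file; the result goes to standard output for review.
# rewrite_sources is a hypothetical helper name used only for this example.
rewrite_sources() {
    # $1 = path to sources.list, $2 = host:port of the Host (e.g. 192.168.0.10:9999)
    sed "s|^\(deb\(-src\)\{0,1\}[[:space:]][[:space:]]*\)http://[^/[:space:]]*|\1http://$2|" "$1"
}
```

A cautious way to use it is to keep a backup first, for instance by copying /etc/apt/sources.list to /etc/apt/sources.list.bak, running rewrite_sources on the backup with 192.168.0.10:9999 as the second argument, reading the output, and only then redirecting it into /etc/apt/sources.list.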
Step 5: Ready to go
Now just give the command:
# apt-get update
on all of the Client computers on which you modified the sources.list file as explained above. From now on, these Clients will fetch their deb packages through your Host computer.
Q: How does the Host computer have the deb package that the Client is requesting?
A: Any package that you install on one of your Client computers after restarting the apt-proxy daemon on your Host computer is downloaded through your Host computer.
For example, suppose you do the following on your first Client computer:
# apt-get install traceroute
Your Host computer will get the package from the Internet (Debian Repository), in the same way it used to fetch packages before you installed apt-proxy. The only difference is that your Host computer will now also store (cache) a copy of the ‘traceroute’ package in the ‘/var/cache/apt-proxy’ directory.
And then suppose you do the following on your 2nd Client computer:
# apt-get install traceroute
Now your 2nd Client computer will fetch the package from your Host computer, specifically from the directory ‘/var/cache/apt-proxy’ that we configured in Step 2 above.
Over time, as you install more and more packages through the Host, you can see the cached packages accumulate in the ‘/var/cache/apt-proxy’ directory of your Host computer:
# ls /var/cache/apt-proxy/debian/pool/main/
a/ e/ i/ libb/ libf/ libm/ libr/ libw/ o/ t/ x/
b/ f/ k/ libc/ libg/ libn/ libs/ libx/ p/ u/ y/
c/ g/ l/ libd/ libi/ libo/ libt/ m/ r/ v/ z/
d/ h/ liba/ libe/ libj/ libp/ libv/ n/ s/ w/
The above directory structure is the same as what we expect to find on a Debian repository, and hence apt-get on your Client machines can easily fetch packages from it.
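To get a rough idea of how much the cache has collected so far, you can count the deb files under it. A minimal sketch, assuming the default cache directory from Step 2; count_debs is just a name I picked for the example.

```shell
# Count the .deb files stored anywhere below a cache directory.
# Defaults to apt-proxy's default cache location.
# count_debs is a hypothetical helper name used only for this example.
count_debs() {
    # tr strips the padding that some wc implementations put around the number.
    find "${1:-/var/cache/apt-proxy}" -type f -name '*.deb' | wc -l | tr -d ' '
}
```

Running count_debs on the Host before and after installing a package on a Client is an easy way to confirm that the package really went through the cache.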
Q: What if the package that we request from a Client machine is not present on the Host machine?
A: Certainly a good question. In that case your Client computer will still request the package from the Host computer, and the Host computer will first check its cache, i.e. the ‘/var/cache/apt-proxy/’ directory, for the requested package. On not finding it, the Host will download it from the Internet (Debian Repository) and your Client will then fetch it from the Host. This process is almost transparent to the Client user. The Host computer then keeps the package cached for future requests from other Clients, so it does not have to be downloaded from the Internet again.
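The hit-or-miss behaviour described above can be pictured with a toy model. To be clear, this is NOT apt-proxy’s code and it downloads nothing: fetch_via_cache is a made-up function that only imitates the decision (serve from the cache if the file is there, otherwise fetch it and keep a copy), using an empty file as a stand-in for the downloaded package.

```shell
# Toy model of a caching proxy's decision, for illustration only.
# Prints "hit" when the package file is already in the cache directory.
# Otherwise prints "miss" and stores a placeholder file, which stands in for
# the real download from the Debian Repository.
# fetch_via_cache is a hypothetical name; apt-proxy has no such command.
fetch_via_cache() {
    # $1 = cache directory, $2 = package file name
    if [ -f "$1/$2" ]; then
        echo "hit"
    else
        : > "$1/$2"    # placeholder for "download from the Internet and cache"
        echo "miss"
    fi
}
```

Calling it twice with the same package name against an empty directory gives “miss” for the first request and “hit” for the second, which is the same pattern the first and second Client see with traceroute.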
Q: What if I installed a package on Host computer, will that get cached too?
A: At first I thought it would, but I tested it on my system and it seems that it does not cache it. I was sure there had to be a way to do that, though. Update: Actually you can do this. You need to use the tool ‘apt-proxy-import’, with which you can import the entire contents (debs) of your apt-get cache directory into the apt-proxy cache. The apt-get cache’s default location is ‘/var/cache/apt/archives/’. Read the documentation and man pages for the details.
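As a starting point, the helper below only builds and prints an import command so that you can look at it before running anything. It is a sketch with assumptions: import_cmd is my own name, and it assumes that apt-proxy-import accepts the deb file names as arguments, which is how I remember its synopsis. Please confirm against man apt-proxy-import on your Host before running the printed command.

```shell
# Print (do NOT run) an apt-proxy-import command covering every .deb file in
# a directory. Defaults to apt-get's own cache directory.
# Assumption: apt-proxy-import takes the .deb file names as arguments.
# import_cmd is a hypothetical helper name used only for this example.
import_cmd() {
    dir=${1:-/var/cache/apt/archives}
    # Replace the function's arguments with the list of matching files.
    set -- "$dir"/*.deb
    # If nothing matched, the pattern is left as-is and is not a real file.
    if [ ! -e "$1" ]; then
        echo "no .deb files found in $dir" >&2
        return 1
    fi
    echo apt-proxy-import "$@"
}
```

If the printed command looks right, run it as root on the Host. It probably makes sense to do this after at least one Client has run apt-get update through the Host, so that apt-proxy already knows the package lists, but that is my assumption rather than something I have verified.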
Q: There is something called apt-cacher also which does the same. What is the difference between apt-cacher and apt-proxy?
A: Yes. apt-cacher and apt-proxy do the same job. The major difference is that with apt-cacher you don’t need to change the sources.list file on your Client computers. All you need to do is create/change a file at ‘/etc/apt/apt.conf.d/01proxy’. See more on this page.
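For the curious, such a 01proxy file usually holds a single apt proxy setting. The sketch below prints that line; proxy_conf_line is a name I made up, the address 192.168.0.10 is the example Host from this article, and port 3142 is what I understand to be apt-cacher’s default, so check your own apt-cacher configuration before relying on it.

```shell
# Print the apt configuration line that tells apt-get on a Client to send its
# HTTP requests through a proxy. Acquire::http::Proxy is a standard apt option.
# proxy_conf_line is a hypothetical helper name used only for this example.
proxy_conf_line() {
    # $1 = host:port of the proxy (assumed here: apt-cacher on port 3142)
    printf 'Acquire::http::Proxy "http://%s";\n' "$1"
}
```

On a Client you would redirect the output of proxy_conf_line 192.168.0.10:3142 into /etc/apt/apt.conf.d/01proxy as root and then run apt-get update.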
Q: Some of my clients use the stable repository and others use the testing/unstable repository. Will apt-proxy still work?
A: Yes, it will. It will cache any package that your clients request (as long as the repository is enabled in the apt-proxy-v2.conf file). Also, it does not matter whether your Host computer itself uses stable/testing/unstable; it will still function properly. You can also use a combination of Debian and Ubuntu repositories by un-commenting the proper lines in the apt-proxy-v2.conf file.
1. If you change the default location of the cache directory from /var/cache/apt-proxy to something else like ‘/root/proxy/cache’ in the apt-proxy-v2.conf file and restart the daemon, it won’t work. At least in my case, I was not able to get it working, so I went back to the default location. Maybe it is a permission issue and we need to give apt-proxy permission to write in the ‘/root/’ directory.
Stay tuned: More on this topic in my future posts.
As usual, please leave a comment/feedback if you have any. Comments encourage bloggers to post more and keep their spirits high.