I fail at tech support

Without the slightest trace of shame, I consider myself very good at troubleshooting technical issues that arise on my own systems. I can dig rather deeply, analyze process calls and determine which program is loading the module that’s faulting. I pride myself on being able to solve issues without resorting to reformatting, which, in my opinion, isn’t a real solution.

However, when someone rings and tells me, “my computer just blue-screened on me”, I find myself at a loss as to how to react. More often than not, I am unable to come to a solution remotely.

I’ve been running this through my head because, you guessed it, I received just one such call tonight. I believe a large part of my problem is that I know very little about what goes on in other people’s computers. Contrast this with my own setups, where I am cautious about, and maintain a good inventory of, the applications I install. Thus, when an error occurs, it is significantly easier for me to backtrack and reproduce the problem.

Another contributing factor is freedom of action. I have full control over my own systems and even the network they reside within. This kind of liberty is often not present when dealing with other people’s computers.

To complicate matters, people do poorly when it comes to describing the exact problem they’re facing. Having the exact error message, especially in blue screen situations, can go a long way towards solving the problem. Although most aren’t very specific, some error messages put a high probability on the fault being hardware related rather than software related (PAGE_FAULT_IN_NONPAGED_AREA comes to mind), greatly narrowing down the source of the problem.

I’m curious as to how others respond when another individual highlights a problem to them. How do you go about gathering as much information as possible about the situation and the events leading up to it, in order to make a few educated guesses as to where the problem lies? Is there a standard operating procedure that you follow?

Setting up vsftpd on Ubuntu

Yet another adventure in Linux land. This time, it was largely due to issues with iptables.

First of all, installing vsftpd was rather easy. Issuing aptitude install vsftpd with superuser privileges was all it took. Configuration didn’t take all that much work either, but it dragged on since this was my first time and I had to keep looking things up in the manual.

The configuration for vsftpd is located at /etc/vsftpd.conf. Out of the box, vsftpd allows anonymous read access. The first order of the day was to disable that by changing the following variable.

anonymous_enable=NO

Users in vsftpd can either be local accounts or vsftpd-specific ones. I used local accounts to save myself the hassle, since all my local accounts have their permissions set correctly already.

local_enable=YES

By default, all user accounts (including local) are only granted read access. Adding the following line enables write and modify permissions.

write_enable=YES
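As a quick sanity check on the three settings above, here is a minimal Python sketch that parses vsftpd.conf-style text and reports any option that doesn't match what this post sets. The helper names (parse_vsftpd_conf, find_mismatches) and the sample fragment are made up for illustration; they are not part of vsftpd. The parser assumes the plain option=value format, and note that vsftpd itself does not accept spaces around the = sign.

```python
# Hypothetical helper names; the sample text below is an illustrative fragment,
# not a real /etc/vsftpd.conf.

def parse_vsftpd_conf(text):
    """Parse option=value lines into a dict, skipping blanks and # comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            options[key] = value
    return options


# The three settings used in this post.
EXPECTED = {
    "anonymous_enable": "NO",
    "local_enable": "YES",
    "write_enable": "YES",
}


def find_mismatches(options, expected=EXPECTED):
    """Return {option: actual value or None} for every setting that differs."""
    return {
        key: options.get(key)
        for key, wanted in expected.items()
        if options.get(key) != wanted
    }


sample = """\
# Example fragment
anonymous_enable=NO
local_enable=YES
#write_enable=YES
"""

print(find_mismatches(parse_vsftpd_conf(sample)))  # prints {'write_enable': None}
```

In the sample, write_enable is still commented out, so it is reported as missing. To check a real system you would pass in the contents of /etc/vsftpd.conf instead of the sample string.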

At this stage, vsftpd is configured and ready to run. However, if you’re behind a firewall, some additional configuration needs to be done to allow vsftpd to listen and pass traffic through it. To a novice who is as inexperienced with iptables as I am, this turned out to be quite an adventure. Before we get to that, however, it is imperative to understand how FTP works.

FTP works in two modes, active and passive. A detailed and extremely useful explanation of the difference between the two can be found at http://slacksite.com/other/ftp.html.

Briefly, in active FTP, the client initiates a connection to the listening server’s control port, and when data transfer is required, the client opens up a port and lets the server know via the control channel which port on the client side to send the data to. Thus, it requires the client to open a port in the listening state. This doesn’t quite work if the client is behind a firewall or NAT.

In passive FTP, the client initiates the connection to the server’s control port. The server then dynamically opens up another port and lets the client know this port number via the control channel. The client then initiates a connection to this port for data transfer.

In summary, active FTP requires an open port on both the client and the server side, while passive FTP requires two open ports on the server side and none on the client’s.
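To make the "lets the other side know the port via the control channel" step concrete, here is a small Python sketch of how that port number travels. In the FTP protocol (RFC 959), both the server's reply to PASV and the client's PORT command carry the address as six comma-separated numbers: four for the IP address and two for the port, where port = p1 * 256 + p2. The function names and the addresses used below are made up for illustration.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a recognizable PASV reply: %r" % reply)
    numbers = [int(n) for n in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("each field must fit in one byte: %r" % reply)
    host = ".".join(str(n) for n in numbers[:4])
    # The port is sent as two bytes, high byte first.
    port = numbers[4] * 256 + numbers[5]
    return host, port


def format_port_command(host, port):
    """Build the PORT command an active-mode client sends to the server."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


# Passive mode: the server tells the client where to connect.
print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,195,80)"))
# prints ('192.168.1.10', 50000)

# Active mode: the client tells the server where to connect back to.
print(format_port_command("192.168.1.20", 50001))
# prints PORT 192,168,1,20,195,81
```

This also shows why a plain packet filter struggles with FTP: the data port only ever appears inside the text of the control connection, so something has to read that conversation to know which port to let through.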

While the server listens on port 21 constantly for incoming connections, and we can specify that as such in the firewall, the dynamically created port poses a problem since it’s open only when required, and is usually a random port within a port range. Having all these ports constantly open is not feasible as it’d be a security issue. To get around this, the firewall has to be stateful and react accordingly.

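Two things need to be done. First, add the stateful rule for the control port in iptables. The second rule shown is an assumption on my part about the rest of the ruleset: if yours doesn’t already accept ESTABLISHED and RELATED traffic, the data connections that connection tracking associates with an FTP session would still be dropped.

-A INPUT -p tcp -m state --state NEW --dport 21 -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT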

Next, we’re going to have to enable the module that allows connection tracking for FTP so that our stateful rule will work. This can be done simply by running modprobe ip_conntrack_ftp as superuser. In order to make this permanent, however, we need to add the module name to the /etc/modules file so that it is loaded on startup.

Took a bit of work, but we now have a fully functional FTP server. My adventures in Linux land continue.

Test-Signing Drivers

Previously, I wrote about how not having signed drivers can be quite a pain on a 64-bit Windows system. I remedied that and made it less of a pain today.

Microsoft provides a set of tools in its Windows Driver Kit for the test-signing of drivers to be used for development purposes. What this means in simple terms is that it provides a way to self-sign drivers, and thus get the system to accept them as though they were digitally signed by MS. This avoids having to disable driver signature enforcement on each start-up.

To begin with, download the Windows Driver Kit, and install the build environment and tools. Once that is done, launch the x64 Free Build Environment shortcut from the shortcuts created in the start menu with administrative rights. In my case, I made a folder consisting of my extracted raid drivers which look like this:

rr174x.cat
rr174x.inf
rr174x.sys

Now, to create the test certificate, we run the following:

makecert -r -pe -ss PrivateCertStore -n CN=mythokia.net(Test) TestCert.cer

Here, mythokia.net(Test) can be replaced by any name. ‘Succeeded’ is echoed upon successful execution of the above. That being done, we proceed to install the certificate on the machine under both Trusted Root Certification Authorities and Trusted Publishers, so that items signed by this particular certificate are recognized.

certmgr /add TestCert.cer /s /r localMachine root
certmgr /add TestCert.cer /s /r localMachine trustedpublisher

‘CertMgr Succeeded’ should be echoed for each. Now to sign the drivers with our certificate. This can be done either by signing the catalog file (the one with the .cat extension), and/or by embedding the signature directly into the binary. From Microsoft’s explanation, drivers loaded at boot time are required to have their signatures embedded in the driver’s binary file itself. Unsure if signing just the binary is sufficient, I went ahead and did both.

signtool sign /v /s PrivateCertStore /n mythokia.net(Test) /t http://timestamp.verisign.com/scripts/timestamp.dll rr174x.cat
signtool sign /v /s PrivateCertStore /n mythokia.net(Test) /t http://timestamp.verisign.com/scripts/timestamp.dll rr174x.sys

Watch the output to see if both were successful. We’re almost ready to install the driver now, but one last twist. The bootloader has to be configured to allow for the running of test drivers. We issue this:

bcdedit -set TestSigning on

This adds an unobtrusive watermark to the bottom right of the screen that says ‘Test Mode’, which I can live with. Besides, I run my server headless anyway, except for the occasional RDP into it.

Now that all that is done, we can finally install our self-signed driver like you would a normal driver. No more manually disabling the enforcement of digitally signed drivers on every boot.

Once again, these steps are detailed a lot more thoroughly on MSDN, but here’s the rough guide to the self-signing of drivers for use on Windows x64 systems. I can finally remove that keyboard from my server.

Win 2008 R2 and signed drivers

After replacing yet another failed disk in my raid array this weekend, I replaced the Windows Vista installation on it with Windows Server 2008 R2, released last week.

I overlooked one important factor before installing Server 2008 R2 on it – I did not have signed drivers for my HighPoint 1740 raid controller. I had assumed, and wrongly so, that the drivers which had worked on Vista x64 would continue to do so in Server 2008 R2 (Win 2008 R2 is only available in x64 flavors), which is only partially the case.

On all x64 versions of Windows, drivers have to be digitally signed. I guess the logic behind this is for reasons of stability and compatibility. You could, however, still force an install of a non-signed driver. The result, however, could be some annoyance.

After the installation of the raid drivers, Windows refused to start and instead booted into a recovery state. It was only then that I discovered that although most drivers that worked under Vista/Server 2008 will work on Windows 7/Server 2008 R2, the signature is only valid for the particular version of the operating system they’re signed for.

To get around this, driver signature enforcement would have to be disabled at each start-up. The way this is done is to hit F5 right after the BIOS POST screen and before Windows starts, then hit F8 to bring up the advanced options, and select the option to disable driver signature enforcement. Troublesome.

There is yet another alternative, which I have not explored. The Windows Driver Kit provides a way to self-sign drivers for testing purposes, and MSDN has an article on how to go about doing this. I’ll have to look into it when I have more time at my disposal.

High-tech manual labor

I had the opportunity of doing some work recently in an HR-like department which handles the manpower administration for a military unit. It is one of those rare places in the military where you get to see technology, in the form of computer systems and networks, being employed.

A particular subset of the work there involves generating reports for soldiers being released from service. The process involved doing data entry from a couple of different documents into a web-based form, then downloading the completed report as a Word document, then opening the document and doing a lot of formatting, before finally printing it out. Then do the same for a hundred-odd documents. In other words, it’s a laborious job, and one that would likely qualify for an entry on thedailywtf.com.

When the process was being explained and shown to me, voices in my head sighed. Coming from a sysadmin/programming background, one important thing you learn is to automate whatever you can, especially repetitive tasks such as this. This was a prime candidate for scripting, and an area where Visual Studio’s integration with MS Office could be put to good use. It’s a pity that the computer was locked down too tightly to allow any chance of doing so.

Sad.

Saved by RAID 5

(Image: RAID 5 rebuilding)

I was greeted by a stream of loud and high-pitched beeping from my server when I returned home on Friday. Panic set in that very moment, but it didn’t last. Within seconds, I had ascertained the source of the problem, and although it was a worrisome one, it wasn’t the end of the world, at least not quite yet.

My first thought was “Doh! Server’s overheating!” but cooler minds prevailed. If the server had indeed overheated, it would have shut itself down instead of beeping madly, and it was still responsive. Well, I thought, the only other thing in the server that had an embedded beeper aside from the motherboard was the RAID controller card, so it had to be that. I have a RAID 5 array, consisting of four hard disks set up on the server, and a quick check showed that the status of the array was ‘critical’, with one drive failing.

In a RAID 5 array, a number of similar capacity drives are used, with the equivalent of one drive’s capacity being used for storing parity bits. Thus, a RAID 5 array can sustain the loss of one drive before data is irrecoverably lost.
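For the curious, here is a toy Python sketch of why parity lets an array survive the loss of one drive. It is an illustration of the XOR idea only, not how my controller is actually implemented: a real RAID 5 array works on much larger stripes and rotates the parity across all the drives. The function names and the sample data are made up, and the capacity figure assumes four drives of 500 GB each.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_capacity(num_drives, drive_size):
    """One drive's worth of space goes to parity; the rest holds data."""
    return (num_drives - 1) * drive_size


# Three data blocks plus one parity block, as in a four-drive array.
data = [b"disk", b"one!", b"two?"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed. XORing everything that
# survives (the other data blocks and the parity) gives the lost block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)                  # prints b'one!'
print(usable_capacity(4, 500))  # prints 1500
```

Lose a second block before the first has been rebuilt and there is no longer enough information to recover either, which is why a failing drive should be replaced as soon as possible.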

So what did I do? I ran out the next day, bought myself a new 500 GB drive for SGD$87, replaced the failing drive, and a disaster that would have involved a very painful loss of data was averted. Since my RAID controller (a rather dated Highpoint RAID 1740) supported online rebuilding, no downtime was incurred at all.

Lesson to be taken from this: Data is precious and you never know when a drive could fail on you. If your entire life resides digitally on your computer, make sure you have a backup plan.