Fixing file coding for git

Version-controlling org files is one of the best ways to prevent data loss. Given that these files contain all kinds of personal and non-public data, I decided to use Gitea, a git server that can be installed locally using Docker.

Once the server was ready, I pushed some initial changes. To my surprise, the entire content of the file was registered as a single line, with “^M” characters where the line breaks should be. Changes were recorded correctly, but I could not review them in the commits, as the single line was too long to display a useful diff.

Coding problem

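I started exploring the problem and discovered that “^M” is how a Carriage Return (CR) character is displayed. Different operating systems use different conventions to mark the end of a line: Unix uses a Line Feed (LF), DOS/Windows uses CR followed by LF, and Classic Mac OS used a bare CR. Depending on the coding system of the buffer, Emacs converts line endings when saving the file.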
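To check which convention a file actually uses, the bytes can be inspected outside Emacs. The following is a minimal Python sketch, not part of my original workflow; the function name count_line_endings is made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count each line-ending style found in raw file bytes."""
    crlf = data.count(b"\r\n")        # DOS/Windows: CR followed by LF
    cr = data.count(b"\r") - crlf     # Classic Mac OS: CR on its own
    lf = data.count(b"\n") - crlf     # Unix: LF on its own
    return {"crlf": crlf, "cr": cr, "lf": lf}


if __name__ == "__main__":
    # A file saved with Classic Mac OS line endings contains only bare CRs.
    print(count_line_endings(b"* Heading\rSome text\r"))
```

A file where only the "cr" count is non-zero is the case that shows up as one long line full of “^M”.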

The “^M” meant the file was using Classic Mac OS line endings (pre-Mac OS X), which did not make any sense. Was this coding system selected by my Doom Emacs configuration? I had no idea, so I asked on the Discord group.

Solving the coding problem

Fortunately, someone shared a link to the relevant section of the Emacs manual.

Checking the docs, I executed M-x describe-coding-system and got:

Coding system for saving this buffer:
  U -- utf-8-mac (alias: mule-utf-8-mac cp65001-mac)

Default coding system (for new files):
  U -- utf-8-unix (alias: mule-utf-8-unix cp65001-unix)

Using M-x set-buffer-file-coding-system, I changed the buffer coding system to utf-8-unix, and now it works as expected. Every line appears on its own line when I push to git, so I can review changes easily.
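The same conversion can be sketched outside Emacs. This is only an illustration of what switching to Unix line endings does to the bytes, not what Emacs runs internally; normalize_to_lf is a hypothetical helper name.

```python
def normalize_to_lf(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR line endings as LF."""
    # Replace CRLF first so its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    print(normalize_to_lf(b"* Heading\rSome text\r"))
```

Handling CRLF before bare CR matters: doing it the other way round would turn every CRLF into two line breaks.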

Next steps

The only missing piece for me now is: why was the coding system utf-8-mac instead of the default utf-8-unix? I also need to check my other files to see whether they are using the right coding system.
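Checking the other files by hand would be tedious, so a small script could do it. The sketch below assumes the org files live under a single directory and end in .org; the function name files_with_bare_cr and the ~/org path are placeholders, not something from my actual setup.

```python
from pathlib import Path


def files_with_bare_cr(root, pattern="*.org"):
    """Return sorted paths under root that contain a CR not followed by LF."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Every CRLF contains one CR, so subtracting leaves the bare CRs.
        bare_cr = data.count(b"\r") - data.count(b"\r\n")
        if bare_cr > 0:
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Placeholder location; adjust to wherever the org files are kept.
    for path in files_with_bare_cr(Path.home() / "org"):
        print(path)
```

Any file it lists would need the same set-buffer-file-coding-system treatment described above.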

P.S. This article about line endings by Tim Clem, a GitHub employee, explains the challenges of keeping line endings consistent across multiple contributors.

