Tag Archive for 'Python'

Zettelkasten and Writing with Joplin, BPG Fonts, Aider, Ollama, Deepseek r1 14B

This is my first attempt at weekly posts. I created an organizational schema and set up the files to begin the work. One thing I accomplished this week was using Aider to create a rapid prototype of a paired comparison analysis tool that runs in the console on any operating system with Python. I used Ollama with Deepseek R1 14B running locally as the backend model. The code for version 25.26.12.20153 is accessible on my website.
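
For reference, pointing Aider at a local Ollama model generally looks like the following; the model tag and API base shown here are the standard Ollama defaults rather than settings recorded in my notes.

ollama pull deepseek-r1:14b
export OLLAMA_API_BASE=http://127.0.0.1:11434
aider --model ollama/deepseek-r1:14b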

The idea of creating an economics website with a spiritual element began to intrigue me quite some time ago. It satisfies several stipulations about how I use my time going forward. After some experimentation, I found that adding images via Zettlr, the word processor I had been using, is cumbersome. I could add them another way or in another program, but Zettlr inspires me to write. I have finally settled on simply using Joplin, because I am aging daily and have less time than in the past due to my long commute.

Part of my inspiration for this post comes from the 27 December 2025 issue of Coffee and Covid by Jeff Childers, in which he details his writing and organization process. I have several hundred megabytes' worth of notes in Joplin. I migrated many notes to Obsidian, but now I want them back. With Joplin, one may right-click a note and copy a markdown link to use within another note. That procedure is less efficient than Zettlr's approach, where typing a colon brings up a list of notes that filters as one types. I changed font families to the following:

Editor font family: BPG Courier GPL&GNU

Editor Monospace font family: BPG Courier S GPL&GNU

Viewer and Rich Text Editor font family: BPG Serif GPL&GNU

This allows me to see a preview of my writing in a serif font, which helps me write more effectively. Joplin automatically exports a daily backup of all the files into a single archive file. I need a second machine configured to export both the archive and individual files in case something happens and the collective archive file fails.

 

Paired Comparison Analysis

This is a simple paired comparison analysis tool that compares a list of items against one another to produce a ranking for decision-making.

 

#!/usr/bin/env python3
#######################################################
# Paired Comparison Analysis
# webmaster@memorymatrix.cloud
# 25.26.12.2053
#######################################################

import sys
import logging


def create_lists(list_a=None, list_b=None):
    """
    Create two lists of items from user input

    Args:
        list_a (list): Initial items for List A (optional)
        list_b (list): Initial items for List B (optional)

    Returns:
        tuple: Two lists (A and B), populated with items

    Raises:
        Exception: Re-raised after logging if input handling fails
    """
    try:
        if list_a is None:
            list_a = []
        if list_b is None:
            list_b = []

        # Prompt for additional items for List A (appends to any initial items)
        while True:
            item = input("Enter an item for List A (press Enter to stop): ")
            if not item:
                break
            list_a.append(str(item))

        # Mirror List A into List B (any provided list_b is replaced)
        list_b = list(list_a)

        logging.info("Lists created successfully")
        return list_a, list_b

    except KeyboardInterrupt:
        print("\nUser interrupted input")
        sys.exit(1)
    except Exception as e:
        logging.error(f"Error creating lists: {str(e)}")
        raise


def count_preferences(comparison_results):
    """
    Count how many times each item was preferred

    Args:
        comparison_results (list): List of tuples from compare_items()

    Returns:
        dict: Dictionary mapping items to their preference counts

    Raises:
        ValueError: If invalid results are provided
    """
    try:
        if not comparison_results:
            raise ValueError("No comparison results provided")

        # Initialize count dictionary
        counts = {}

        for item_a, item_b, preference in comparison_results:
            if preference == 1:
                counts[item_a] = counts.get(item_a, 0) + 1
            elif preference == 2:
                counts[item_b] = counts.get(item_b, 0) + 1

        return counts

    except Exception as e:
        logging.error(f"Error counting preferences: {str(e)}")
        raise


def compare_items(list_a, list_b):
    """
    Compare unique pairs of items between two lists and store preferences

    Args:
        list_a (list): First list of items
        list_b (list): Second list of items

    Returns:
        list: Results of comparisons

    Raises:
        ValueError: If lists are empty or mismatched
    """
    try:
        if not list_a or not list_b:
            raise ValueError("Both lists must contain items")

        results = []

        # Generate unique pairs (a, b) where a is from A and b is from B.
        # Skip self-comparisons, and require alphabetical order so each
        # unordered pair is asked about only once.
        for item_a in list_a:
            for item_b in list_b:
                if item_a == item_b or item_a > item_b:
                    continue

                try:
                    # Re-prompt until the user enters a valid choice
                    while True:
                        preference = input(f"Compare {item_a} vs {item_b}: "
                                           f"Enter 1 if you prefer {item_a}, "
                                           f"2 if you prefer {item_b}: ")
                        if preference in ('1', '2'):
                            break
                        print("Invalid input. Please enter 1 or 2.")

                    results.append((item_a, item_b, int(preference)))

                except KeyboardInterrupt:
                    print("\nUser interrupted comparison")
                    return results  # Return what we have so far

    except Exception as e:
        logging.error(f"Error during comparison: {str(e)}")
        raise
    return results

if __name__ == "__main__":
    """
    Main program entry point with command line arguments
    """
    try:
        # Configure logging
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s - %(levelname)s - %(message)s',
            handlers=[logging.StreamHandler()]
        )

        # Get items from command line or user input
        if len(sys.argv) >= 3:
            list_a = [str(sys.argv[1]), str(sys.argv[2])]
        else:
            print("No command line arguments provided.")
            first_item = input("Enter the first item for List A: ")
            list_a = [first_item]

        list_a, list_b = create_lists(list_a=list_a)

        results = compare_items(list_a, list_b)

        # Get preference counts
        preferences = count_preferences(results)

        print("\nComparison Results:")
        for res in results:
            print(f"Comparing {res[0]} vs {res[1]} - Preferred: {res[2]}")

        print("\nPreference Counts:")
        for item, count in preferences.items():
            print(f"{item} was preferred {count} times")

    except IndexError:
        # Handle cases where lists are too short
        print("Error: Not enough items provided. At least two items required")
        sys.exit(1)
    except KeyboardInterrupt:
        print("\nProgram interrupted by user")
        sys.exit(0)


# Unit Tests
# To run these, activate the virtual environment (source venv/bin/activate),
# then run: python3 -m pytest script-name.py
from unittest.mock import patch


@patch('builtins.input')
def test_create_lists_default_case(mock_input):
    """create_lists stops on an empty entry and mirrors List A into List B"""
    mock_input.side_effect = ['apple', 'banana', 'berry', '']
    list_a, list_b = create_lists()
    assert len(list_a) == 3
    assert list_b == list_a


@patch('builtins.input')
def test_create_lists_single_item(mock_input):
    mock_input.side_effect = ['test', '']
    list_a, list_b = create_lists()
    assert len(list_a) == 1
    assert list_b == list_a


@patch('builtins.input')
def test_compare_items_valid_input(mock_input):
    """Self-comparisons are skipped; one valid pair yields one result"""
    mock_input.side_effect = ['1']
    results = compare_items(['a'], ['a', 'b'])
    assert results == [('a', 'b', 1)]


@patch('builtins.input')
def test_compare_items_invalid_then_valid(mock_input):
    """An invalid entry ('3') is rejected and the prompt is repeated"""
    mock_input.side_effect = ['3', '1']
    results = compare_items(['a'], ['a', 'b'])
    assert len(results) == 1


@patch('builtins.input')
def test_count_preferences_single_comparison(mock_input):
    mock_input.side_effect = ['1']
    results = compare_items(['a'], ['a', 'b'])
    preferences = count_preferences(results)
    assert preferences == {'a': 1}


@patch('builtins.input')
def test_count_preferences_multiple_comparisons(mock_input):
    """Four unique cross-list pairs means four prompts and four counts"""
    mock_input.side_effect = ['2', '1', '1', '2']
    results = compare_items(['apple', 'orange'], ['pear', 'tomato'])
    preferences = count_preferences(results)
    assert len(preferences) == 4
    assert preferences.get('pear', 0) == 1
    assert preferences.get('apple', 0) == 1
Data Science Time Warp Machine

Fedora 38 sometimes freezes and crashes when using GNOME on bare metal, which may be the result of GNOME reliability issues. In a previous article I detailed creating a massive repo of Fedora 38, and I still have it. I will not delete the 238GB repo, because Fedora 40 is the last release with Python 2.7 in the repositories; Fedora elected to remove it completely in Fedora 41 and beyond. I created some software in Python 2.7 that may never make it to Python 3, because, given my available time, I will be an old man before I could complete the conversion. A few years ago I migrated from bare metal to WSL, creating my own WSL instance from the Fedora 36 cloud-init image and upgrading it over the years to Fedora 38, at which point I ceased updating it. WSL crashes and cannot be relied upon to run tasks that require many hours of continuous processing.

WSL really was wonderful for development and for running Linux applications with underlying Linux features. I used it for development in PyCharm. The problem is that I would often return after 12 hours to a message that the terminal could be closed with CTRL + D, which indicated that the service had stopped for some reason. I suspect these failures occurred when available RAM conflicted with Linux's /dev/shm shared-memory features, but troubleshooting would take too long. I don't trust the releases from the Windows Store, because forced updates in Windows can take features away or cause unexpected problems. I upgraded my Windows 11 Home desktop to Windows 11 Pro specifically so I could disable automatic Windows updates via group policies; the service disablement and registry modifications available on Windows 11 Home fail to stop auto updates.

To create a long-use time capsule of sorts, I decided to switch from Fedora 38 to AlmaLinux 8. AlmaLinux 9 follows the tradition of RHEL 9 and removes easy support for Python 2.

I set up AlmaLinux 8.10 (Cerulean Leopard) from the KDE live DVD and installed RStudio Server for access via web browser.

Edit /etc/dnf/dnf.conf and add keepcache=True under the [main] section.

dnf install epel-release    
dnf config-manager --enable powertools
dnf install R    
dnf install python2

The python2 package installs pip for Python 2.7 automatically; it is called via the pip2.7 command.

As a regular user, the following is required for a script I made, because parsedatetime changed after version 2.5 and is no longer compatible with the previous versions.

pip2.7 install parsedatetime==2.5 --user
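
For illustration, the API that this pin preserves is used roughly as follows under Python 2.7; the date string is an example, not a line from my script.

# Python 2.7 sketch against parsedatetime 2.5 (the pinned version);
# the input string is illustrative only.
import parsedatetime

cal = parsedatetime.Calendar()
time_struct, parse_status = cal.parse("tomorrow at 9am")
print time_struct, parse_status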

• Install rstudio-2024.12.0+467-1.rpm from direct download

• Install rstudio-server-rhel-2024.12.0-467.rpm from direct download

systemctl enable rstudio-server

Configure the firewall to allow port 8787, RStudio Server's default.
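
With firewalld, the AlmaLinux default, that would typically be:

firewall-cmd --permanent --add-port=8787/tcp
firewall-cmd --reload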

usermod -a -G rstudio-server <username> 
setenforce 0

The last instruction, which turns off SELinux enforcement, is temporary until I can ascertain the specific rules that need modification to allow RStudio Server to work. With SELinux enforcing under the initial configuration, the server cannot be accessed remotely via web browser.

T+n time series analysis

This isn't stock advice; it's a data and math project. AAPL is interesting from a data perspective (not investing advice!). Part of my big data project was the creation of a tool to analyze every symbol regressed on every other symbol for numerous lags. In addition to revealing the vendor's mangling of data for unknown purposes, that analysis of lags produced at least one independent variable upon which the regression of future AAPL produced a 0.3 R-squared value. That is a weak or low effect size, but it was a genuine discovery that passed hypothesis testing across time horizons. The variable was Direxion Daily S&P Oil & Gas Exp. & Prod. Bear 2X Shares (DRIP), the first independent variable I discovered through the lag analysis. The problem is that the lag analysis is very computationally intensive. Then, after doing that, ferreting out the red herrings caused by bogus data takes another large amount of time. It seems that my data provider inserts bogus data producing 0.9 R-squared values between different vectors. These are, of course, problems that one can mitigate with better code. That, too, takes time. This isn't a document about that first statistically significant predictor for AAPL at t+3.
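
A minimal sketch of the core regression, assuming a hypothetical prices.csv of percent-change columns; the actual pipeline across Python, R, and shell scripts is much larger.

# Hypothetical sketch: regress AAPL %change at t+3 on DRIP %change at t.
# The file name and column names are placeholders.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("prices.csv")
y = df["AAPL_pct_change"].shift(-3)          # response shifted to t+3
X = sm.add_constant(df["DRIP_pct_change"])   # predictor at t, plus intercept
model = sm.OLS(y, X, missing="drop").fit()
print(model.rsquared)                        # roughly 0.3 for DRIP in my data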

Keeping the t+ and t- notation in mind causes some difficulty for me. T+2 means today plus 2: if today is January 5, t+2 is January 7. If we are analyzing t+3, that is January 8. Programming in 0-indexed languages has produced an inner impulse to count from 0 instead of 1, and counting 0, 1, 2 through the list [5, 6, 7, 8] lands on 7, even though t+3 is January 8, at index 3. This is a heavily ingrained impulse that I must both use and mitigate. This project used Python, R, and shell scripting.
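
A tiny illustration of that off-by-one trap, using the January dates from the example above:

# Day t is January 5; the list holds consecutive January dates.
days = [5, 6, 7, 8]

# t+3 means three days after January 5, i.e. January 8, at index 3:
print(days[3])   # 8 - the correct t+3 value

# The zero-indexing impulse of counting "0, 1, 2" stops one short:
print(days[2])   # 7 - which is t+2, not t+3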

The first one-factor model devised using DRIP was as follows:

AAPL Typical %change(t+3) = -2.065 × (DRIP Typical %change) - 0.0040

This model was tested in Stata using the data manipulated in Python and R. This formula may be of no use in the future, or even now. I might revisit this when discussing the lags portion of the project later.
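
As a worked example, here is that model as a function; the coefficients come from the formula above, while the sample input and the assumption that %change is expressed as a decimal fraction are mine.

# The one-factor model above as a function. The 0.01 input (a +1% DRIP
# move) is hypothetical, and I am assuming %change is a decimal fraction.
def predict_aapl_t3(drip_typical_pct_change):
    return -2.065 * drip_typical_pct_change - 0.0040

print(predict_aapl_t3(0.01))   # -0.02465, about a -2.5% predicted move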