Persistent document numbering

#1

I am using Python to create a prototype business document. The document numbering needs to be sequential and stored by the program so that numbering persists from one use of the script/application to the next. Where might I find example code for the most common ways of storing and securing document numbering with Python and Java? Invoice numbering would be similar. Would you use a .txt file, JSON, pickle, shelve or even sqlite3? Is there an ISO standard for document numbering and handling?


#2

It’s been a while since I’ve done any quality control implementation, so please take this with a grain of salt. Broadly speaking though, the ISO 9000 family has many standards for Quality Management and many of them are focused on Document Management Systems. The ISO 27000 family of Information Security Management Systems also has a lot to say on the subject. Largely speaking it’s going to come down to whatever ISO implementations the accrediting body (or bodies) certifying your business require. It’s usually easier, in my opinion anyway, to turn to the QA/QC documents those bodies issue rather than trying to find all the ISO standards for document control yourself…that’s a deep, dark hole to go down.

Judging by the examples you picked, I bet you’ve already found this. Just in case though, the Python Standard Library documentation has a section on data persistence that covers pickle, shelve and sqlite3, among others. Which methods you use is going to depend on what standards (and level of information security) you choose to adhere to. I would note it is almost always frowned upon to store any personally identifying information in unencrypted text files.

Sorry I can’t point to any code examples for such practice. All the places I’ve worked for have used commercial document management systems solely to avoid the standards/security headache that comes with it.


#3

Thank you very much for such a professional reply.
I am very much a newbie with respect to actual coding so I know that I am probably reinventing the wheel (with all the associated headaches), but that’s ok to me, since I can learn from it and already I have encountered a few gotchas and found workable solutions either on the internet or on my own with a little internet researching. I may still need an experienced coder to reno my reno farther down the road.
I haven’t yet found exactly what I’m looking for with respect to document numbering, although I have seen a few examples on Stack Overflow that involved a document number as an attribute of a document object created from an Invoice class. It is starting to appear to me that the document number is usually saved to “disk” as an attribute of an object rather than saved as a single item. So, I will have to learn/brush up on my OOP knowledge and skills for Python and Java as well as working with dictionaries in Python, pickle(?), shelve, JSON, YAML and maybe even sqlite3.
I am currently trying to move from saving a CSV file containing a single record with the document number as one of the fields, to using a hack that involves creating and updating a setting in an .ini file. I know it is very insecure and my incremental document numbering code is a hack, but hopefully it is only temporary and just a learning experience. Try not to wince with pain when you see the following code (it is very clunky; I need to plan out an object-oriented design with proper functions/methods and dictionaries).
For my Python 3 code, I used some illustrative code from an internet tutorial using configparser to create an ini file. The algorithm and code are very basic function-based (procedural) programming, but just slightly above my newbie level of understanding. Understanding how the get_config(path) function/method works is one source of slight perplexity. It tests to see if the path exists or not. If the path doesn’t exist it runs a function named create_config with path as an argument. If the path exists, it reads config and returns it.
Source: Mouse Vs Python: Python 101: An Intro to ConfigParser (http://www.blog.pythonlibrary.org/2013/10/25/python-101-an-intro-to-configparser/)
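
The get_config logic described above can be sketched roughly as follows. This is a reconstruction from the description, not the tutorial's exact code: the [Settings] section and document_number key are the ones used later in this post, while the starting value of 0 is an assumption. Note that this version reads the file in both cases, including right after creating it.

```python
import configparser
import os


def create_config(path):
    """Write a new ini file holding a starting document number (assumed layout)."""
    config = configparser.ConfigParser()
    config["Settings"] = {"document_number": "0"}  # assumed starting value
    with open(path, "w") as config_file:
        config.write(config_file)


def get_config(path):
    """Return a parser loaded from path, creating the file first if it is missing."""
    if not os.path.exists(path):
        create_config(path)  # only happens when the ini file does not exist yet
    config = configparser.ConfigParser()
    config.read(path)
    return config
```

So the first call creates the file and every later call simply reads what is already there, which is what makes the number persist between runs.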

My hack reads in the ini file and gets the document_number setting.

document_number = get_setting(path, 'Settings', 'document_number')

# HOLD NOSE HERE

# convert document_number string to integer type for integer addition
doc_num = int(document_number)
next_doc = doc_num + 1
# (tried using += but had some trouble, since it isn't in a loop)

# convert integer next_doc to string
new_document_number = str(next_doc)
update_setting(path, 'Settings', 'document_number', new_document_number)

This seems to work, but I feel like a coder in kindergarten and I am embarrassed to share it because it is so clunky. Do I really need to use so many new variables just to convert a string to an integer and back? BAD SOLUTION? Also, is my hack an example of interpolation that could lead to a security problem further down the road? Should I create a separate settings.py file with an administrator/manager password, and only use it or import functions from it when I need to change a setting? Is THAT the interpolation security gap? Of course, a hacker could easily read an ini file and alter it. Should I be using SafeConfigParser instead, or just move on to a more professional and secure method, whatever that may be?
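
On the question of the intermediate variables: they are not strictly needed. The sketch below is only an illustration, assuming the same [Settings] section and document_number key; next_document_number is a hypothetical helper, not part of the tutorial's code. It leans on configparser's getint() for the string-to-integer conversion, and += works fine outside a loop.

```python
import configparser


def next_document_number(path):
    """Hypothetical helper: read, increment and save the counter, then return it."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is silently skipped
    # getint() converts the stored string to int; fallback covers a missing key
    number = config.getint("Settings", "document_number", fallback=0)
    number += 1
    if not config.has_section("Settings"):
        config.add_section("Settings")
    config.set("Settings", "document_number", str(number))  # ini values are strings
    with open(path, "w") as config_file:
        config.write(config_file)
    return number
```

This does nothing for security, it only trims the conversion steps.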
I am very aware of the need to keep personally identifying information secure either by never storing it, especially in plaintext, or by storing it behind a secure firewall/vault and/or storing only an encrypted/hashed form for comparison purposes. It seems a bit inconvenient to ask the user to always type in the very sensitive information. I would like to find a commercial solution such as Samsung KNOX and whatever Accounting software like QuickBooks uses to store Bank Account numbers and Credit Card numbers etc., but may end up using the Python hashlib module and the cryptography module or Cryptodome (being very aware that my ignorance could easily lead to leaving a big security gap to be exploited by a hacker).
Of course, if the U.S.-based cloud SaaS provider had an open API, none of this would be necessary. However, I am finding it an interesting applied learning experience even if I do end up having wasted my time if a third party commercial solution is released in a few months or next year.


#4

I think you’re on the right path. And don’t be embarrassed, this is a learning forum after all, we’re all here to grow as coders. As for what is going to be the best solution and methodology in the end…it’s hard to say. It really is going to depend on your level of expertise, familiarity with the methods you learn, and most importantly, the specific use case. I don’t think I’m the right person to point you in the correct direction on the security front, so I’m going to avoid guessing. Just make sure you cover all your bases if this concept ever goes into production. Information security is no joke.

As a learning exercise though, maybe I can help. Since you mentioned an object oriented approach I made an example Document class that might hold and manage some of the information in a simple document as well as parse & update a configuration INI file. I attempted to include step-by-step comments for the areas you noted some difficulty with. I should add, I wrote this in Python 3.7. There isn’t really any error handling in this example either.

import configparser
import os

class Document:
    """ Example Document Class that holds and returns
    some simple document information """
    def __init__(self, title, author, text, config_path='DocumentConfig.ini'):
        self.title = title
        self.author = author
        self.text = text
        self.doc_num = None
        self.path = config_path
        self.check_config_path()  # Ensure path exists, else create it
        self.new_config_entry()  # Update config file upon instantiation

    def check_config_path(self):
        if not os.path.exists(self.path):
            # If path does not exist proceed with indented statement
            # INI components are key:value pairs within different 'sections' of the INI file
            # The components work similarly to Python dictionaries
            config = configparser.ConfigParser()  # Instantiate the parser
            config['Config'] = {}  # New Config Section in INI
            config['Config']['Last Document Created'] = ''  # New entry in section
            config['Config']['Total Document Count'] = '0'  # New count entry in section
            with open(self.path, 'w') as cf:
                # Write the new INI file
                config.write(cf)

    def new_config_entry(self):
        """ Rewrites the configuration file with an updated count & title """
        if not os.path.exists(self.path):
            print('Configuration file not found')
            return
        config = configparser.ConfigParser()  # Instantiate the parser
        # the 'converters' option or getint() could be used here to apply str --> int
        config.read(self.path)  # Read the file into the parser
        config['Config']['Last Document Created'] = self.title  # Set the last created doc string to current title
        count = int(config['Config']['Total Document Count']) + 1  # interpret the INI's count as an integer and increment
        config['Config']['Total Document Count'] = str(count)  # Set the new count in the parser as a string
        self.doc_num = count  # update internal variable for future use
        print('Latest Title: ' + self.title + '\nDocument Number: ' + str(self.doc_num))
        with open(self.path, 'w') as cf:
            # Write the new INI file over the old one
            config.write(cf)

Usage:

new_document = Document('Test Document', 'Cicero', 'Lorem Ipsum...')
# Prints: 
# "Latest Title: Test Document
# Document Number: 1"
# This has (hopefully) updated the simple configuration file or
# created a new one to update if it didn't already exist

vars(new_document)
# {'title': 'Test Document',
# 'author': 'Cicero',
# 'text': 'Lorem Ipsum...',
# 'doc_num': 1,
# 'path': 'DocumentConfig.ini'}

The configuration file (DocumentConfig.ini, in this case) looks like this:

[Config]
last document created = Test Document
total document count = 1
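
Since sqlite3 came up in the original question, here is a minimal sketch of the same counter kept in a one-row table instead of an INI file. The file name documents.db, the counter table and the function name next_number_sqlite are all made up for illustration, and this does not address the security side at all. The increment and the read happen inside a single transaction.

```python
import sqlite3


def next_number_sqlite(db_path="documents.db"):
    """Hypothetical helper: increment and return a counter stored in SQLite."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "CREATE TABLE IF NOT EXISTS counter ("
                "id INTEGER PRIMARY KEY CHECK (id = 1), "
                "value INTEGER NOT NULL)"
            )
            # Seed the single row on first use; ignored once it exists
            conn.execute("INSERT OR IGNORE INTO counter (id, value) VALUES (1, 0)")
            conn.execute("UPDATE counter SET value = value + 1 WHERE id = 1")
            (value,) = conn.execute(
                "SELECT value FROM counter WHERE id = 1"
            ).fetchone()
        return value
    finally:
        conn.close()
```

Each call returns the next number, and the count survives between runs because it lives in the database file rather than in the program.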