• Python - working with xml/lxml/objectify/schemas, datatypes, and assign

    From aapost@21:1/5 to All on Tue Jan 3 22:57:02 2023
    I am trying to wrap my head around how one goes about working with and
    editing xml elements since it feels more complicated than it seems it
    should be.. Just to get some feedback on how others might approach it
    and see if I am missing anything obvious that I haven't discovered yet,
    since maybe I am wandering off in a wrong way of thinking..

    I am looking to interact with elements directly, loaded from a template,
    editing them, then ultimately submitting them to an API as a modified
    xml document.

    Consider the following:

    from lxml import objectify, etree
    schema = etree.XMLSchema(file="path_to_my_xsd_schema_file")
    parser = objectify.makeparser(schema=schema, encoding="UTF-8")
    xml_obj = objectify.parse("path_to_my_xml_file", parser=parser)
    xml_root = xml_obj.getroot()

    let's say I have a Version element, that is defined simply as a string
    in a 3rd party provided xsd schema

    <xs:element name="Version" type="xs:string" minOccurs="0">

    and is set to a number <Version>2342</Version> in my document

    The xml file loads with the above code successfully against the schema

    But lxml objectify decides the element type is Int, and the pytype is int..

    Version <class 'lxml.objectify.IntElement'>
    Version.pyval <class 'int'>

    Let's say I want this loaded into a UI with a variety of dynamically
    loaded entry widgets so I can edit a large number of values like this
    and of many other different types.

    I can assign in one of two ways (both give the same result)
    xml_root.Version =
    xml_root['Version'] =
    (if there is some other more kosher way of assignment, let me know)

    I can assign "2342" and the element suddenly becomes a <class 'lxml.objectify.StringElement'>

    I can assign 1.4 and the element suddenly becomes a <class 'lxml.objectify.FloatElement'>

    The schema is not checked during this assignment, so the value can be
    invalid: assigning "abc" to an xs:dateTime is accepted anyway, and the
    original value is lost. The only way I see to verify against the
    schema again is to do so explicitly against the whole root.

    schema.validate(xml_root)

    This returns False because of the added xmlns:py, py:pytype stuff. I can
    strip those with:

    objectify.deannotate(xml_root[etree.QName(xml_root.Version.tag).localname], cleanup_namespaces=True)

    and get back to schema.validate(xml_root) validating True. BUT, it
    validates True whether the element is a String, Int, Float, etc (so long
    as it 'could' potentially be a string or something..).. So let's say a
    Version is 322.1121000, should be a string, validates against the schema
    as string, but is now 322.1121 (much more relevant for something like a
    product identification number)
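
    The loss described above can be reproduced with plain Python, without
    lxml. This is only a stand-in for what happens once a string-typed value
    has been coerced to a number; the variable names are made up for the
    illustration.

```python
# A string-typed identifier survives only as long as it stays a string.
original = "322.1121000"

as_float = float(original)      # what a numeric coercion would store
round_tripped = str(as_float)   # what would be written back out

print(round_tripped)            # 322.1121
print(round_tripped == original)  # False

# The same thing happens to leading zeros when a value is treated as an int.
print(str(int("0042")))         # 42
```

    Both forms are valid xs:string values, which is why validating afterwards
    cannot notice that the text changed.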

    If it is a case where the validate remains False, I then have to
    manually look at the error log via schema.error_log for something like this:

    api_files/Basic:0:0:ERROR:SCHEMASV:SCHEMAV_CVC_DATATYPE_VALID_1_2_1:
    Element '{nsstuff}StartTime': 'asdfasdfa' is not a valid value of the
    atomic type 'xs:dateTime'.
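
    One possible way to turn a log line like the one above into something a
    UI can use is a small regular expression. This is a sketch that assumes
    the message keeps exactly the wording shown here; the function name and
    the group names are my own and not part of lxml.

```python
import re

LOG_LINE = (
    "api_files/Basic:0:0:ERROR:SCHEMASV:SCHEMAV_CVC_DATATYPE_VALID_1_2_1: "
    "Element '{nsstuff}StartTime': 'asdfasdfa' is not a valid value of the "
    "atomic type 'xs:dateTime'."
)

# Optional {namespace}, then the local element name, the rejected value
# and the schema type the value was checked against.
PATTERN = re.compile(
    r"Element '(?:\{(?P<ns>[^}]*)\})?(?P<name>[^']+)': "
    r"'(?P<value>.*)' is not a valid value of the atomic type '(?P<type>[^']+)'"
)

def describe_datatype_error(line):
    """Return the pieces of a datatype error line, or None if it does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None

info = describe_datatype_error(LOG_LINE)
print(info["name"], info["value"], info["type"])
```

    Other kinds of schema errors use different wording, so anything that does
    not match comes back as None and would need its own handling.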

    Then I have to consider how I should reject the user's input. From a UI
    design standpoint it just seems like a lot of added steps, and redundant
    work on top of an object layer that doesn't really do anything other than
    give me a thumbs up on the way in and a thumbs up on the way out. Rather
    than interacting with an object that can say whether a change is schema
    approved from the get-go, I instead seem to have to parse 100000+
    lines of xsd and design UI interaction much more situationally and
    granularly to assert types and corner cases and preserve original values
    in duplicate structures, etc..

    The xml features I originally assumed I would find don't seem to exist,
    from what I have found so far. The schema should be the law: if my schema
    says something should be loaded as a string, it should be a string (or
    something close enough, definitely not an int or float), and attempting
    to assign something to it that doesn't match the schema should be denied
    or throw an error. I am sure under the hood it would probably have
    performance drawbacks or something.. Oh well.. Back to
    contemplating and tinkering..
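
    To make the wish concrete, here is a rough stdlib-only sketch of the
    behaviour I have in mind: a value holder that checks every assignment
    and keeps the old value when the check fails. CheckedField and the two
    check functions are hypothetical, and datetime.fromisoformat is only a
    loose stand-in for a real xs:dateTime check.

```python
from datetime import datetime

class CheckedField:
    """Holds one value and refuses changes that fail its check."""

    def __init__(self, value, check):
        check(value)              # the initial value must pass as well
        self._check = check
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._check(new_value)    # raises before anything is stored
        self._value = new_value

def check_string(value):
    if not isinstance(value, str):
        raise TypeError(f"expected str, got {type(value).__name__}")

def check_datetime(value):
    check_string(value)
    datetime.fromisoformat(value)  # raises ValueError on bad input

version = CheckedField("2342", check_string)
try:
    version.value = 1.4
except TypeError as exc:
    print("rejected:", exc)
print(version.value)  # 2342
```

    The point is only the shape of the interaction: the rejection happens at
    the assignment, and the original value is still there afterwards.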

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dieter Maurer@21:1/5 to aapost on Wed Jan 4 15:42:28 2023
    Does your schema include the third party schema?

    You might have a look at `PyXB`, too.
    It tries hard to enforce schema restrictions in Python code.

  • From aapost@21:1/5 to Dieter Maurer on Wed Jan 4 12:13:47 2023
    Yes, to clarify, they provide the schema, which is what we use,
    downloaded locally. Basically just trying to remain compliant with their
    structures that they already define without reinventing the wheel for
    numerous calls and custom types, and in a way that feels more live
    rather than just checking validity at the end of the edits as if I were
    modifying the XML manually.

    Thank you for the suggestion, PyXB works much more like how I envisioned working with xml in my head:

    xml_root.Version = 1231.32000
    pyxb.exceptions_.SimpleTypeValueError: Type {http://www.w3.org/2001/XMLSchema}string cannot be created from: 1231.32
    xml_root.Version = "1231.32000"

    I will have to do some more testing to see how smooth the transition
    back to a formatted document goes, since it creates a variable for all
    possible fields defined in the type, even if they are optional and not
    there in the situational template.

    Thanks

  • From aapost@21:1/5 to aapost on Tue Jan 10 22:15:07 2023
    Unfortunately, after picking it apart for a while and diving deeper into
    a rabbit hole, PyXB looks to be a no-go.

    PyXB, while interesting (and I respect its complexity and depth), is
    lacking in design consistency in how it operates if you are trying to
    modify and work with the resulting structure intuitively. It was
    developed on Python 2 14 years ago, made compatible with Python 3 late,
    and seems like it was trying to maintain vast version compatibility rather
    than getting a needed overhaul, before being abandoned in 2017 after the
    author moved on to more interesting work... I don't blame him, lol.. The
    community forks are just minor bug fixes currently.

    There are no setValue()/_setValue() functions for SimpleTypes (the bulk
    of your objects), so you can't change their values directly. Assigning to
    them appears to work if they are nested inside a parent that has
    __setattr__ overloaded (as a default resulting structure does when you
    first load a document), but it is a rat's nest as far as what happens
    from there. Sometimes it calls .Factory(), sometimes it goes through a
    series of __init__s, but nothing is really clear on what is or is not a
    kosher approach to managing value changes, and my attempts to work out
    how to fold those paths into a single _setValue() call have failed
    so far.

    Then there are ComplexTypes, with a value called _IsSimpleContent, which
    indicates whether it is a wrapper for a custom SimpleType, or something
    that does not contain SimpleType data. These DO have _setValue()
    functions IF it contains SimpleType data, where the SimpleType is stored
    in a __content member variable. Assignment on these also appears to work
    but the results aren't good, you need to use _setValue(), or you lose
    things like attributes. It would have been nicer if the structure of the
    ComplexType was called something else and wrapped all objects with a
    common set of functions.

    The validate functions do not work how one would assume, like they do
    for other libraries, where they go back and verify the data. I believe
    they only function on the way in, because if the data becomes invalid
    through some manual messing with it after the fact, they indicate that
    the data is still valid even when it's not.

    It seems like what I am probably looking for may reside in java with
    JAXB, but nothing really beyond that.

    generateDS doesn't really offer anything I need from what I could tell
    in messing with it, and by the looks of it, I might have to bite the
    bullet and use the xmlschema library, work with dicts, handle many more
    corner cases on my side, and just let the result be a lot clunkier than
    I was hoping. Unless I find 8 years to reinvent the wheel myself, and I am
    not sure I am granted that ability. Oh well. lol

  • From Dieter Maurer@21:1/5 to aapost on Wed Jan 11 19:21:06 2023
    aapost wrote at 2023-1-10 22:15 -0500:
    On 1/4/23 12:13, aapost wrote:
    On 1/4/23 09:42, Dieter Maurer wrote:
    ...
    You might have a look at `PyXB`, too.
    It tries hard to enforce schema restrictions in Python code.
    ...
    Unfortunately picking it apart for a while and diving deeper in to a
    rabbit hole, PyXB looks to be a no-go.

    PyXB while interesting, and I respect it's complexity and depth, is
    lacking in design consistency in how it operates if you are trying to
    modify and work with the resulting structure intuitively.
    ... problem with simple types ...

    I use `PyXB` in `dm.saml2` and `dm.zope.saml2`, i.e. with
    the SAML2 schema definitions (which include those
    of XML signature and XML encryption).
    I had no problems with simple types. I just assign them to attributes
    of the Python objects representing the XML elements.
    `PyXB` does the right thing when it serializes those objects into XML.

  • From aapost@21:1/5 to aapost on Sun Jan 15 20:13:35 2023
    On 1/3/23 22:57, aapost wrote:
    I am trying to wrap my head around how one goes about working with and editing xml elements ... Back to
    contemplating and tinkering..

    For anyone in a similar situation, xmlschema is actually quite nice.

    It didn't have the features I was looking for out of the box, but it
    does have a to_objects function and I have learned quite a bit while
    picking it apart. I am able to patch it to be good enough for my
    requirements.
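
    The patching itself is nothing more than attaching plain functions to an
    existing class with setattr. A minimal sketch of that mechanism, using a
    stand-in Element class of my own rather than anything from xmlschema:

```python
class Element:
    """Stand-in for a library class that cannot be edited directly."""

    def __init__(self, value):
        self.value = value

def get_value(self):
    # 'self' is supplied automatically once the function lives on the class.
    return self.value

# Attach the plain function to the class; every instance, including ones
# created earlier, now sees it as a bound method.
setattr(Element, "get_value", get_value)

elem = Element("2342")
print(elem.get_value())  # 2342
```

    Replacing an existing method works the same way, so the patch has to be
    applied before any code that relies on the new behaviour runs.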

    Below is the patch for anyone interested:

    #
    # Contribution for the xmlschema & elementpath python modules which are
    # Copyright (c), 2016-2020, SISSA (International School for Advanced Studies).
    # All rights reserved.
    #
    # This file is distributed under the terms of the MIT License.
    # See the file 'LICENSE' in the root directory of the present
    # distribution, or http://opensource.org/licenses/MIT.
    #

    # Patching and expansion of the xmlschema.dataobjects.DataElement object features
    # to get the best demonstration, change schema variable to your .xsd,
    # and xmlobj to your .xml files
    # then run this as $ python -i filename.py

    from typing import Any, Optional, Union, Tuple
    #from types import MethodType

    class ValueLockedError(Exception):
        def __init__(self, obj, variable_name):
            self.message = "Can't set ." + variable_name + \
                           "\nThe object:\n" + str(obj) + \
                           "\nis Locked (._locked is set to True)"
            super().__init__(self.message)

    # importing in order necessary for intended monkey patch
    import elementpath.etree as ep_etree

    # Monkey patching additional static functions to the import of elementpath.etree

    # for namespace management of xml.etree.ElementTree code paths (which use
    # the global variable register_namespace._namespace_map for namespace registering)
    def etree_remove_registered_namespace(elem: ep_etree.ElementProtocol,
                                          uri: str = '') -> None:
        etree_module: Any
        if not ep_etree.is_etree_element(elem):
            raise TypeError(f"{elem!r} is not an Element")
        elif isinstance(elem, ep_etree.PyElementTree.Element):
            etree_module = ep_etree.PyElementTree
        elif not hasattr(elem, 'nsmap'):
            etree_module = ep_etree.ElementTree
        else:
            import lxml.etree as etree_module  # type: ignore[no-redef]

        if not hasattr(elem, 'nsmap'):
            if uri in etree_module.register_namespace._namespace_map:
                del etree_module.register_namespace._namespace_map[uri]
        else:
            # TODO research this for better understanding
            # _namespace_map is uri->prefix
            # DataElement.nsmap prefix->uri
            # lxml etree .nsmap ?->?
            # not using lxml anyway so not really an issue as
            # this condition shouldn't be met
            # (iterate over a copy so entries can be deleted inside the loop)
            for key, value in list(elem.nsmap.items()):
                # research - can there be multiple instances of uri to prefix?..
                # or are they intended to be 1:1?..
                if value == uri:
                    if key in elem.nsmap:
                        del elem.nsmap[key]

    #patching
    # a module attribute is used as-is, so the plain function is enough here;
    # a staticmethod object is not callable before Python 3.10
    setattr(ep_etree, "etree_remove_registered_namespace",
            etree_remove_registered_namespace)

    # for namespace management of xml.etree.ElementTree code paths (which use
    # the global variable register_namespace._namespace_map for namespace registering)
    def etree_get_registered_namespaces(elem: ep_etree.ElementProtocol) -> dict:
        etree_module: Any
        if not ep_etree.is_etree_element(elem):
            raise TypeError(f"{elem!r} is not an Element")
        elif isinstance(elem, ep_etree.PyElementTree.Element):
            etree_module = ep_etree.PyElementTree
        elif not hasattr(elem, 'nsmap'):
            etree_module = ep_etree.ElementTree
        else:
            import lxml.etree as etree_module  # type: ignore[no-redef]

        if not hasattr(elem, 'nsmap'):
            return etree_module.register_namespace._namespace_map
        else:
            return elem.nsmap  # shouldn't be met

    #patching
    # a module attribute is used as-is, so the plain function is enough here;
    # a staticmethod object is not callable before Python 3.10
    setattr(ep_etree, "etree_get_registered_namespaces",
            etree_get_registered_namespaces)

    # for namespace management of xml.etree.ElementTree code paths (which use
    # the global variable register_namespace._namespace_map for namespace registering)
    def etree_register_namespace(elem: ep_etree.ElementProtocol,
                                 prefix: str = None,
                                 uri: str = None) -> None:
        etree_module: Any
        if not ep_etree.is_etree_element(elem):
            raise TypeError(f"{elem!r} is not an Element")
        elif isinstance(elem, ep_etree.PyElementTree.Element):
            etree_module = ep_etree.PyElementTree
        elif not hasattr(elem, 'nsmap'):
            etree_module = ep_etree.ElementTree
        else:
            import lxml.etree as etree_module  # type: ignore[no-redef]

        if prefix != None and uri != None:
            if not hasattr(elem, 'nsmap'):
                etree_module.register_namespace(prefix, uri)
            else:
                # TODO research this for better understanding
                # _namespace_map is uri->prefix
                # DataElement.nsmap prefix->uri
                # lxml etree .nsmap ?->?
                # not using lxml anyway so not really an issue as
                # this condition shouldn't be met
                elem.nsmap[prefix] = uri

    #patching
    # a module attribute is used as-is, so the plain function is enough here;
    # a staticmethod object is not callable before Python 3.10
    setattr(ep_etree, "etree_register_namespace",
            etree_register_namespace)


    # importing in order necessary for intended monkey patch
    import xmlschema

    # Monkey patching additional instance functions to the import of xmlschema
    # specifically xmlschema.dataobjects.DataElement

    # Instance functions so DataElement object can use above
    # elementpath.etree namespace functions
    def register_namespace(self, prefix: str = None, uri: str = None) -> None:
        #root = self.encode(validation='strict')
        root, errors = self.encode(validation='lax')
        if prefix != None and uri != None:
            ep_etree.etree_register_namespace(root, prefix, uri)

    #patching
    setattr(xmlschema.dataobjects.DataElement, "register_namespace", register_namespace)

    def remove_registered_namespace(self, uri: str = '') -> None:
        #root = self.encode(validation='strict')
        root, errors = self.encode(validation='lax')
        ep_etree.etree_remove_registered_namespace(root, uri)

    #patching
    setattr(xmlschema.dataobjects.DataElement,
            "remove_registered_namespace", remove_registered_namespace)

    def get_registered_namespaces(self) -> dict:
        #root = self.encode(validation='strict')
        root, errors = self.encode(validation='lax')
        return ep_etree.etree_get_registered_namespaces(root)

    #patching
    setattr(xmlschema.dataobjects.DataElement, "get_registered_namespaces", get_registered_namespaces)


    # replacing .validate() & .is_valid() on DataElement so that namespaces
    # from the DataElement get set to the xml.etree.ElementTree
    # register_namespace._namespace_map global when used
    def validate(self, use_defaults: bool = True,
                 namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                 max_depth: Optional[int] = None) -> None:
        """
        Validates the XML data object.
        :raises: :exc:`XMLSchemaValidationError` if XML data object is not valid.
        :raises: :exc:`XMLSchemaValueError` if the instance has no schema
        bindings.
        """
        if (self.nsmap and namespaces == None):  #added code
            namespaces = self.nsmap  #added code
        for error in self.iter_errors(use_defaults, namespaces, max_depth):
            raise error

    #patching
    setattr(xmlschema.dataobjects.DataElement, "validate", validate)

    def is_valid(self, use_defaults: bool = True,
                 namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                 max_depth: Optional[int] = None) -> bool:
        """
        Like :meth:`validate` except it does not raise an exception on validation
        error but returns ``True`` if the XML data object is valid, ``False`` if
        it's invalid.

        :raises: :exc:`XMLSchemaValueError` if the instance has no schema
        bindings.
        """
        if (self.nsmap and namespaces == None):  #added code
            namespaces = self.nsmap  #added code
        error = next(self.iter_errors(use_defaults, namespaces, max_depth), None)
        return error is None

    #patching
    setattr(xmlschema.dataobjects.DataElement, "is_valid", is_valid)


    # replace .tostring() on DataElement to allow for
    # xml_declaration/encoding support
    # TODO research more, will likely customize a bit further
    def tostring(self,
                 namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                 indent: str = '',
                 max_lines: Optional[int] = None,
                 spaces_for_tab: Optional[int] = None,
                 xml_declaration: Optional[bool] = None,
                 encoding: str = 'unicode',
                 method: str = 'xml') -> Any:

        if (self.nsmap and namespaces == None):
            namespaces = self.nsmap

        # Serializes the data element tree to an XML source string.
        # root, errors = self.encode(validation='lax')
        root = self.encode(validation="strict")  #prefer strict on my output just in case..
        return ep_etree.etree_tostring(
            root, namespaces, indent, max_lines, spaces_for_tab,
            xml_declaration, encoding, method)

    #patching
    setattr(xmlschema.dataobjects.DataElement, "tostring", tostring)


    # add get_value function - paired with set_value
    def get_value(self) -> Any:
        print(type(self))
        return self.value

    #patching
    setattr(xmlschema.dataobjects.DataElement, "get_value", get_value)

    # add set_value function
    # assures change meets XMLSchema
    # reverts back on error
    # assumes data meets Schema to begin with, will remain unchanged in the
    # end if it is not
    # :raises: :exc:`XMLSchemaValidationError` if XML data object is not
    # valid after attempted change
    # :raises: :exc:`XMLSchemaValueError` if the instance has no schema
    # bindings.
    # :raises: :exc:`ValueLockedError` if using ._locked and set to True
    def set_value(self,
                  value: Any,
                  use_defaults: bool = True,
                  namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                  max_depth: Optional[int] = None) -> None:
        if hasattr(self, "_locked") and self._locked == True:
            raise ValueLockedError(self, variable_name='value')
        else:
            if hasattr(self, "_locked"):
                self._locked = True

        self._set_value_temp_value = self.value

        self.value = value

        if (self.nsmap and namespaces == None):
            namespaces = self.nsmap
        for error in self.iter_errors(use_defaults, namespaces, max_depth):
            self.value = self._set_value_temp_value  # revert value back to original
            del self._set_value_temp_value  #clean up
            if hasattr(self, "_locked"):  # unlock before raising if using/exists
                self._locked = False
            raise error  # raise error

        # no errors

        del self._set_value_temp_value  # clean up
        if hasattr(self, "_locked"):
            self._locked = False  # unlock before returning if using/exists

    #patching
    setattr(xmlschema.dataobjects.DataElement, "set_value", set_value)


    # add get_attrib function - paired with set_attrib
    # remove added logic from .get(), requiring explicit matches only
    def get_attrib(self, key: str) -> Any:
        return self.attrib[key]

    #patching
    setattr(xmlschema.dataobjects.DataElement, "get_attrib", get_attrib)

    # add set_attrib function
    # assures change meets XMLSchema
    # reverts back on error
    # assumes data meets Schema to begin with, will remain unchanged in the
    # end if it is not
    # :raises: :exc:`XMLSchemaValidationError` if XML data object is not
    # valid after attempted change
    # :raises: :exc:`XMLSchemaValueError` if the instance has no schema
    # bindings.
    # :raises: :exc:`ValueLockedError` if using ._locked and set to True
    def set_attrib(self,
                   key: str,
                   value: Any,
                   use_defaults: bool = True,
                   namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                   max_depth: Optional[int] = None) -> Union[bool, Optional[Tuple[bool, str]]]:
        if hasattr(self, "_locked") and self._locked == True:
            raise ValueLockedError(self, variable_name='attrib[' + key + ']')
        else:
            if hasattr(self, "_locked"):
                self._locked = True

        if key in self.attrib:
            self._set_attrib_temp_value = self.attrib[key]  # save original value if exists
        else:
            self._set_attrib_value_did_not_exist = True  # or note if it doesn't exist

        self.attrib[key] = value

        if (self.nsmap and namespaces == None):
            namespaces = self.nsmap
        for error in self.iter_errors(use_defaults, namespaces, max_depth):
            if hasattr(self, '_set_attrib_temp_value'):
                self.attrib[key] = self._set_attrib_temp_value  # revert value back to original if existed
                del self._set_attrib_temp_value
            elif hasattr(self, '_set_attrib_value_did_not_exist'):
                del self.attrib[key]  # or just delete if it didn't
                del self._set_attrib_value_did_not_exist
            if hasattr(self, "_locked"):
                self._locked = False
            raise error

        # no errors

        if hasattr(self, '_set_attrib_temp_value'):
            del self._set_attrib_temp_value  # clean up
        elif hasattr(self, '_set_attrib_value_did_not_exist'):
            del self._set_attrib_value_did_not_exist  # clean up

        # TODO research @property / some or some type of better variable binding?
        # self._expand_xDE_attrib_prefix exists if expand_xmlschema_DataElement is run
        if hasattr(self, '_expand_xDE_attrib_prefix'):
            setattr(self, self._expand_xDE_attrib_prefix + key, value)

        if hasattr(self, "_locked"):
            self._locked = False  # unlock before returning if using/exists

    #patching
    setattr(xmlschema.dataobjects.DataElement, "set_attrib", set_attrib)

    # add del_attrib function
    # assures change meets XMLSchema
    # reverts back on error
    # assumes data meets Schema to begin with, will remain unchanged in the
    # end if it is not
    # :raises: :exc:`XMLSchemaValidationError` if XML data object is not
    # valid after attempted change
    # :raises: :exc:`XMLSchemaValueError` if the instance has no schema
    # bindings.
    # :raises: :exc:`ValueLockedError` if using ._locked and set to True
    # :raises: :exc:`KeyError` if xml tag attribute (.attrib[key]) doesn't exist
    def del_attrib(self,
                   key: str,
                   use_defaults: bool = True,
                   namespaces: Optional[xmlschema.aliases.NamespacesType] = None,
                   max_depth: Optional[int] = None) -> Union[bool, Optional[Tuple[bool, str]]]:
        if hasattr(self, "_locked") and self._locked == True:
            raise ValueLockedError(self, variable_name='attrib[' + key + ']')
        else:
            if hasattr(self, "_locked"):
                self._locked = True

        if key in self.attrib:
            self._del_attrib_temp_value = self.attrib[key]  # save original value if exists
        else:
            if hasattr(self, "_locked"):
                self._locked = False
            raise KeyError("'" + key + "' Attribute does not exist, nothing to do")

        del self.attrib[key]

        if (self.nsmap and namespaces == None):
            namespaces = self.nsmap
        for error in self.iter_errors(use_defaults, namespaces, max_depth):
            if hasattr(self, '_del_attrib_temp_value'):
                self.attrib[key] = self._del_attrib_temp_value  # attribute required, recreate value back to original
                del self._del_attrib_temp_value
            if hasattr(self, "_locked"):
                self._locked = False
            # append informational message to error output
            if hasattr(error, "message"):
                error.message += ":\n\nThe attribute value was returned to original state due to error" \
                                 "\n\nThis error represents the state of this element IF the attribute were removed"
            raise error

        # no errors

        if hasattr(self, '_del_attrib_temp_value'):
            del self._del_attrib_temp_value  # clean up

        # TODO research @property / some or some type of better variable binding?
        # self._expand_xDE_attrib_prefix exists if expand_xmlschema_DataElement is run
        if hasattr(self, '_expand_xDE_attrib_prefix'):
            delattr(self, self._expand_xDE_attrib_prefix + key)

        if hasattr(self, "_locked"):
            self._locked = False  # unlock before returning if using/exists

    #patching
    setattr(xmlschema.dataobjects.DataElement, "del_attrib", del_attrib)


    # Monkey patching some class methods helpful for learning / troubleshooting
    # (classmethod() is applied once, below, when the function is attached)
    def _show_me_mro(cls):
        return cls.mro()

    setattr(xmlschema.validators.schemas.XsdValidator, "_show_me_mro", classmethod(_show_me_mro))
    setattr(xmlschema.dataobjects.DataElement, "_show_me_mro", classmethod(_show_me_mro))


    schema = xmlschema.XMLSchema("path/to/your.xsd", converter=xmlschema.JsonMLConverter)
    xmlobj = schema.to_objects("path/to/your.xml")


    # creates dot notation naming for all children recursively
    # c_ default prefix for child, a_ default prefix for tag attribute
    # _# numbered suffix for all children starting at 0
    # increases from there if more than 1 child with same name
    def expand_xmlschema_DataElement(xsobj: xmlschema.dataobjects.DataElement,
                                     child_prefix: str = 'c_',
                                     attrib_prefix: str = 'a_') -> None:
        xsobj._expand_xDE_child_prefix = child_prefix
        xsobj._expand_xDE_attrib_prefix = attrib_prefix

        # _locked just an idea at the moment, may or may not use this in the end
        setattr(xsobj, "_locked", False)

        # set a class attribute for each xml tag attribute
        # DO NOT change these directly, use set_attrib on the parent class
        # which changes .attrib first
        # These are currently just a copy of what is in the .attrib dict
        # Validation has no knowledge of their existence if they are changed
        # outside of design
        # TODO research @property / or some type of better variable binding?
        if (xsobj.attrib):
            #print(xsobj.local_name + " has attributes")
            for key in xsobj.attrib.keys():
                setattr(xsobj, xsobj._expand_xDE_attrib_prefix + key,
                        xsobj.attrib[key])

        # set a class attribute for each child
        for each in xsobj.iterchildren():
            # pass the prefixes down so the whole tree uses the same naming
            expand_xmlschema_DataElement(each, child_prefix, attrib_prefix)
            count = 0
            while(True):
                if hasattr(xsobj, xsobj._expand_xDE_child_prefix +
                           each.local_name + "_" + str(count)):
                    count += 1
                else:
                    setattr(xsobj, xsobj._expand_xDE_child_prefix + each.local_name
                            + "_" + str(count), each)
                    break

    expand_xmlschema_DataElement(xmlobj)
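
    For anyone who wants to see the naming scheme without installing
    xmlschema, the same c_<name>_<n> numbering can be sketched with the
    stdlib ElementTree. child_attribute_names and the sample document are
    made up for this illustration; they are not part of the patch above.

```python
import xml.etree.ElementTree as ET

def child_attribute_names(parent, child_prefix="c_"):
    """Map generated names such as c_Item_0, c_Item_1 to the child elements."""
    names = {}
    for child in parent:
        local_name = child.tag.rsplit("}", 1)[-1]  # drop any {namespace} part
        count = 0
        # take the first free suffix for this local name
        while f"{child_prefix}{local_name}_{count}" in names:
            count += 1
        names[f"{child_prefix}{local_name}_{count}"] = child
    return names

root = ET.fromstring(
    "<Job><Version>2342</Version><Item>a</Item><Item>b</Item></Job>"
)
names = child_attribute_names(root)
print(list(names))  # ['c_Version_0', 'c_Item_0', 'c_Item_1']
```

    Every child gets a numbered suffix, even when it occurs only once, so
    code that walks the structure never has to special-case repeated
    elements.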

  • From aapost@21:1/5 to Dieter Maurer on Sun Jan 15 20:56:21 2023
    It does do a lot of good things, and I am sad to see all the good work
    in it not get used, but for me it really boils down to what it sums up
    itself in a couple of comments from the author in its first file (I
    appreciate them and their honesty, because those are comments I could
    see myself writing in a similar situation)...

    ######
    class cscRoot (object):
        """This little bundle of joy exists because in Python 2.6 it
        became an error to invoke C{object.__init__} with parameters (unless
        you also override C{__new__}, in which case it's only a warning.
        Whatever.). Since I'm bloody not going to check in every class
        whether C{super(Myclass,self)} refers to C{object} (even if I could
        figure out how to do that, 'cuz the obvious solutions don't work),
        we'll just make this thing the root of all U{cooperative super
        calling<http://www.geocities.com/foetsch/python/new_style_classes.htm#super>}
        hierarchies.
    ######

    ######
        def __init__ (self, *args, **kw):
            # Oh gross. If this class descends from list (and probably dict), we
            # get here when object is *not* our direct superclass. In that case,
            # we have to pass the arguments on up, or the strings don't get
            # created right. Below is the only way I've figured out to detect the
            # situation.
            #
            # Note that we might also get here if you mix-in a class that used
            # object as a parent instead of cscRoot. Don't do that. Printing the
            # mro() is a decent way of identifying the problem.
    ######

    Using that suggestion, you can see that on simple types:

    pyxbxmlroot.SomeString._mro()
    [<class 'pyxb.binding.datatypes.string'>,
     <class 'pyxb.binding.basis.simpleTypeDefinition'>,
     <class 'pyxb.binding.basis._TypeBinding_mixin'>,
     <class 'pyxb.utils.utility.Locatable_mixin'>,
     <class 'pyxb.utils.utility._DeconflictSymbols_mixin'>,
     <class 'pyxb.binding.basis._DynamicCreate_mixin'>,
     <class 'pyxb.cscRoot'>,
     <class 'str'>,
     <class 'object'>]

    It has a Python type (str) that it sends all the way up the MRO, right
    next to object, when that type doesn't actually enter the hierarchy
    until after simpleTypeDefinition, in
    class string (basis.simpleTypeDefinition, str):

    This makes the object dependent on its parent. Since it itself IS the
    value, I can't assign to it or do anything to it on its own without it
    and all the other stuff going away. As designed, it is very hard to
    change anything in it without breaking something.
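
A small sketch of the point about the MRO, using plain Python classes as hypothetical stand-ins for the PyXB ones (the names `Root`, `Mixin`, `SimpleTypeDefinition` and `String` are invented here, and none of PyXB's real behaviour is reproduced). It only shows that mixing `str` in last places it right before `object`, and that an object which is itself the string value cannot have that value replaced in place.

```python
# Hypothetical stand-ins for the PyXB classes named in the MRO above.
class Root:                               # plays the role of pyxb.cscRoot
    pass


class Mixin(Root):                        # stands in for the *_mixin classes
    pass


class SimpleTypeDefinition(Mixin):
    pass


# Mirrors: class string (basis.simpleTypeDefinition, str)
class String(SimpleTypeDefinition, str):
    pass


print([cls.__name__ for cls in String.mro()])
# ['String', 'SimpleTypeDefinition', 'Mixin', 'Root', 'str', 'object']

s = String("2342")
s.note = "extra state kept on the instance"

# A str cannot be changed in place, so a "new value" means a new object,
# and the extra state on the old instance does not come along.
replacement = String("9999")
print(hasattr(replacement, "note"))   # False

# Ordinary string operations return a plain str and drop the subclass.
print(type(s + "1") is str)           # True
```

A container-style design, where an element object holds its value in an attribute, avoids this: the value can be reassigned while the element and its metadata stay the same object.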

    Working with xmlschema pretty much confirmed my assumption that it
    doesn't need to be that way. I was able to follow what was going on and
    tweak xmlschema fairly easily.

    That and the fact that PyXB was abandoned 5-6 years ago make it a strong
    no-go to use in a project. It would need to be adopted with fresh
    development, stripped of the Python 2 stuff, and the object structure
    redesigned in a more uniform way with functionality properly
    containerized instead of all stuffed together...

  • From Dan Kolis@21:1/5 to All on Thu Jan 19 07:43:26 2023
    Editing text intended primarily for machine reading that involves metadata and lower level facts is a horror show.

    I sort of worked for a company years ago and a smart ass suggested I was making labor for myself by doing changes to a scripting language for db users, maybe a few hours a week. He suggested I leave them to do it themselves in XML.

    I tried it and the community went ape-shit angry. NONE of the people succeeded routinely at all.

    Just because machine-readable nomenclatures are well known and some s/w edits them doesn't mean they're suddenly not mostly mini computer programs.

    I suspect there isn't an easy way out, and probably the thing you're making has to be 100% usably done before maintenance tools can be created to make it easy, anyway.

    It's the way it is, maybe,
    Daniel B. Kolis
