C-cedilla -- Special Character

  • ddtpmyra
    Contributor
    • Jun 2008
    • 333

    C-cedilla -- Special Character

    I declared a variable set to the C-cedilla character:

    Code:
    DECLARE @TAB CHAR(1)=char(199)
    But every time I export it, it gives me a different character. Is there a way to write U+00C7, which is the C-cedilla? Your help will be much appreciated.

    Thank you.
  • NeoPa
    Recognized Expert Moderator MVP
    • Oct 2006
    • 32662

    #2
    Only the first 128 characters (codes 0 to 127) are standard ASCII and can be assumed to be the same across different character sets. 199 is C-cedilla in at least one character set, but in others it is something else entirely. You probably need to look at what is interpreting the exported character and which character set it's using.
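
To illustrate the point, here is a minimal Python sketch that decodes the single byte 199 under a few single-byte character sets. Python is used only for illustration, and the list of code pages is an arbitrary selection, not something taken from the thread.

```python
import unicodedata

# Illustrative selection of single-byte code pages (an assumption, not from the thread).
CODE_PAGES = ["cp1252", "latin-1", "cp437", "cp850", "cp1251"]


def decode_byte(value: int, code_page: str) -> str:
    """Return the character that `code_page` assigns to one byte value."""
    return bytes([value]).decode(code_page)


def describe_byte(value: int) -> dict:
    """Map each code page to the Unicode name of the character for `value`."""
    return {cp: unicodedata.name(decode_byte(value, cp)) for cp in CODE_PAGES}


if __name__ == "__main__":
    for cp, name in describe_byte(199).items():
        print(f"{cp:8} byte 199 -> {name}")
```

Byte 65 is "A" in every one of these code pages, while byte 199 is only C-cedilla in some of them.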

    • Rabbit
      Recognized Expert MVP
      • Jan 2007
      • 12517

      #3
      U+00C7 is unicode, hence the U. If you want to use unicode encoding, then you need to use NCHAR. The rest of your data will have to be converted to NCHAR and NVARCHAR if they are not.

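      If you want to stay with a single-byte encoding, then capital C cedilla is 128 and lower case c cedilla is 135 in the OEM code pages (437 and 850, often loosely called extended ASCII). Strictly speaking, ASCII itself only defines codes 0 to 127, so it has no cedilla at all.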

      If you need to use any of the other cedillas, you will have to switch to unicode.
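
The values 128 and 135 match code pages 437 and 850; in code page 1252 the same two letters sit at 199 and 231. A quick way to check this is the Python sketch below. The helper name `byte_value` is made up for the example.

```python
def byte_value(ch: str, code_page: str) -> int:
    """Return the single byte value that `code_page` uses for `ch`."""
    encoded = ch.encode(code_page)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {code_page}")
    return encoded[0]


if __name__ == "__main__":
    upper, lower = "\u00c7", "\u00e7"  # capital and lower case c cedilla
    for cp in ("cp437", "cp850", "cp1252"):
        print(cp, byte_value(upper, cp), byte_value(lower, cp))
```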

      • ddtpmyra
        Contributor
        • Jun 2008
        • 333

        #4
        @NeoPa:
        It interprets it as a '€' char.

        @Rabbit:
        Can you give me an example of a select that will use U+00C7?
        Last edited by NeoPa; Nov 29 '12, 03:41 AM. Reason: Replying to each post needn't (shouldn't) require a separate post for each. Merged.

        • ddtpmyra
          Contributor
          • Jun 2008
          • 333

          #5
          @NeoPa:
          When I select from the temp table inside SQL Server, it shows as a cedilla, but every time I export it to a .dat file it comes out as a different character: €.

          I wonder if I'm using the correct BCP format (below). Any other suggestions?
          Code:
          SELECT @SQL='BCP testfile OUT '+@FILENAME+' -c -t -T -r '+@@SERVERNAME
          Last edited by NeoPa; Nov 29 '12, 03:43 AM. Reason: [CODE] tags are mandatory. Please don't forget to use them in future.

          • NeoPa
            Recognized Expert Moderator MVP
            • Oct 2006
            • 32662

            #6
            The contents of the file are not characters as such. The numbers are interpreted as characters by whatever software is used to view them. What character sets it uses depends on how your system is set up.

            It may well be that Rabbit's suggestion of using unicode variables (NCHAR & NVARCHAR) to hold your data is the way to go. Have you looked into that yet?
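
One plausible explanation for the '€' reported earlier can be sketched in Python. It assumes the export wrote C-cedilla using an OEM code page such as 437 and that the viewer read the file as code page 1252. The thread does not confirm which code pages were actually in effect, so treat both as assumptions.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode `text` with one codec and decode the bytes with another."""
    return text.encode(written_as).decode(read_as)


if __name__ == "__main__":
    c_cedilla = "\u00c7"
    # Written as OEM 437 (byte 128), read as Windows-1252.
    garbled = misread(c_cedilla, "cp437", "cp1252")
    print(f"U+{ord(garbled):04X}")  # code point of what the viewer shows
```

When the writer and the reader agree on the code page, the character survives the round trip unchanged.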

            • ddtpmyra
              Contributor
              • Jun 2008
              • 333

              #7
              Originally posted by NeoPa
              The contents of the file are not characters as such. The numbers are interpreted as characters by whatever software is used to view them. What character sets it uses depends on how your system is set up.

              It may well be that Rabbit's suggestion of using unicode variables (NCHAR & NVARCHAR) to hold your data is the way to go. Have you looked into that yet?
              Sorry, I don't know how to do that. Can you show me an example?

              Thank you.

              • Rabbit
                Recognized Expert MVP
                • Jan 2007
                • 12517

                #8
                To create a Unicode character from its code, you use the NCHAR function.
                Code:
                SELECT NCHAR(199)
                But here's the thing: just because you write the data in Unicode doesn't mean it will look right. Unless you're also reading it as Unicode, it won't. What you really need to do is find out the encoding that the reader will be using and then write to match that.
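
The mismatch can be seen in a short Python sketch. It assumes the data is written as UTF-16 little-endian and that the reader expects code page 1252. The sample word is invented for the example.

```python
sample = "Fa\u00e7ade"  # hypothetical sample text containing a c cedilla

# Write side: two bytes per character in UTF-16 little-endian.
written = sample.encode("utf-16-le")

# Read side, wrong codec: every second byte is 0x00 and shows up as a NUL.
read_as_cp1252 = written.decode("cp1252")

# Read side, matching codec: the original text comes back.
read_as_utf16 = written.decode("utf-16-le")

if __name__ == "__main__":
    print(written.hex(" "))
    print(len(sample), len(read_as_cp1252), len(read_as_utf16))
```

Read with the wrong codec, the text is twice as long as the original because each character is followed by a NUL.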

                • NeoPa
                  Recognized Expert Moderator MVP
                  • Oct 2006
                  • 32662

                  #9
                  Rabbit's illustration shows how to cast a value into Unicode format - NCHAR(). This would still need to be stored, if it were stored, in a variable that is also defined as NCHAR.

                  That would indicate that your original line of code would be:
                  Code:
                  DECLARE @TAB NCHAR(1)=NCHAR(199)
                  I'm not sure what the Unicode equivalent of the C-cedilla is though. Probably not 199, but I'm sure you can find that.

                  PS. Scratch that. It is 199 (or &hC7) for the capital and 231 (or &hE7) for the lower-case. See W3 - Characters Ordered by Unicode.

                  • Rabbit
                    Recognized Expert MVP
                    • Jan 2007
                    • 12517

                    #10
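                    Yes it is, at least for UCS-2 (which is what SQL Server uses for NCHAR) and also UTF-16. The equivalent in the OEM code pages (437 and 850, often loosely called extended ASCII) is 128 for capital C cedilla and 135 for lower case c cedilla. UCS-2 and UTF-16 also use 2 bytes per character (UTF-16 needs 4 for characters outside the Basic Multilingual Plane), whereas a single-byte code page uses only one.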
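
The size difference is easy to check with a small Python sketch. The helper name `encoded_length` is hypothetical, and the emoji is an extra example (not from the thread) of a character that UTF-16 stores in 4 bytes and UCS-2 cannot represent at all.

```python
def encoded_length(ch: str, codec: str) -> int:
    """Return how many bytes `codec` needs to store `ch`."""
    return len(ch.encode(codec))


if __name__ == "__main__":
    c_cedilla = "\u00c7"
    print(encoded_length(c_cedilla, "cp437"))       # single-byte code page
    print(encoded_length(c_cedilla, "utf-16-le"))   # UTF-16, BMP character
    print(encoded_length("\U0001F600", "utf-16-le"))  # UTF-16, surrogate pair
```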

                    • ddtpmyra
                      Contributor
                      • Jun 2008
                      • 333

                      #11
                      I can see the cedilla in the right format when it's stored in the temporary table, which is good! But every time I export it to a .dat file it looks like this...



                      I wonder if the cause of this is how I called BCP:

                      Code:
                      SELECT @SQL='BCP Test_File OUT '+@FILENAME+' -c -t -T -S '+@@SERVERNAME

                      • Rabbit
                        Recognized Expert MVP
                        • Jan 2007
                        • 12517

                        #12
                        You have two options.

                        1) Use -w instead of -c. -w will encode the text in Unicode (UTF-16).
                        2) Use -c but also specify -C ACP to use the ANSI code page, which is 1252 on most Western Windows systems. Code page 1252, also known as Windows Latin 1, is the most common code page used on Windows.
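
Whichever option is chosen, whatever reads the .dat file has to decode it with the matching encoding. The Python sketch below simulates that. It does not run bcp: the mapping from options to codecs is an assumption for a Western Windows client (ACP taken as 1252, the default OEM code page taken as 437), and the file name and helper names are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from the bcp options above to Python codecs.
# Assumption: ACP is 1252 and the OEM code page is 437 on this client.
CODEC_FOR_BCP_OPTION = {
    "-w": "utf-16-le",      # Unicode export
    "-c -C ACP": "cp1252",  # character export, ANSI code page
    "-c": "cp437",          # character export, default (OEM) code page
}


def read_export(path: Path, bcp_option: str) -> str:
    """Decode an exported file with the codec that matches the bcp option."""
    return path.read_bytes().decode(CODEC_FOR_BCP_OPTION[bcp_option])


def demo() -> dict:
    """Simulate each export in a temp directory and read it back."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for option, codec in CODEC_FOR_BCP_OPTION.items():
            path = Path(tmp) / "export.dat"
            # Stand-in for real bcp output: one value and a line break.
            path.write_bytes("\u00c7\r\n".encode(codec))
            results[option] = read_export(path, option)
    return results
```

Calling `demo()` returns the same text for all three options, because each simulated file is decoded with the codec it was written in.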

                        • ddtpmyra
                          Contributor
                          • Jun 2008
                          • 333

                          #13
                          You are awesome, NeoPa and Rabbit! It works. Thank you soooo much for your help!

                          • NeoPa
                            Recognized Expert Moderator MVP
                            • Oct 2006
                            • 32662

                            #14
                            I'm glad we could help, though TBF I think Rabbit's know-how was probably more helpful than mine for this one. You may want to decide which of the posts helped you nail it in the end and select it as Best Answer. I'd do it for you, but I can't tell where you were most stuck and which one opened the gates for you.
